The study of networks has become a substantial interdisciplinary endeavor that encompasses
myriad disciplines in the natural, social, and information sciences. Here we introduce a framework 
for constructing taxonomies of networks based on their structural similarities. These networks can 
arise from any of numerous sources: they can be empirical or synthetic, they can arise from multiple 
realizations of a single process (either empirical or synthetic), they can represent entirely different 
systems in different disciplines, etc. Because mesoscopic properties of networks are hypothesized to 
be important for network function, we base our comparisons on summaries of network community 
structures. Although we use a specific method for uncovering network communities, much of the 
introduced framework is independent of that choice. After introducing the framework, we apply it to 
construct a taxonomy for 746 networks and demonstrate that our approach usefully identifies similar 
networks. We also construct taxonomies within individual categories of networks, and we thereby 
expose nontrivial structure. For example, we create taxonomies for similarity networks constructed 
from both political voting data and financial data. We also construct network taxonomies to compare 
the social structures of 100 Facebook networks and the growth structures produced by different types 
of fungi. 

Keywords: networks; clustering; community structure 



I. INTRODUCTION 

Although there is a long tradition of scholarship on 
networks, the last two decades have witnessed substan- 
tial advances in network science due to developments in 
physics, mathematics, computer science, sociology, and 
numerous other disciplines [1, 2]. Given that the questions
asked by researchers in different fields can be surprisingly
similar, it would be useful to be able to highlight
similarities in network structures across disciplines in a 
systematic way. One way to approach this is to formu- 
late a suitable means of comparing networks and to use 
this means to develop taxonomies of networks. Such tax- 
onomies have the potential to facilitate the identification 
of problems from different disciplines that might be ap- 
proached similarly in terms of both empirical analyses 
and theoretical modeling. For example, if a biological 
network depicting covariation of neural activity in differ- 
ent regions of the brain is demonstrated to be structurally 
similar to a financial network representing correlations 
of stock returns, then certain types of edge thresholding 
methods or structural null models might be applicable to
both situations.

From a historical perspective, classification of objects 
has often been central to the progress of science, as 
demonstrated by the periodic table of elements in chem- 
istry and phylogenetic trees of organisms in biology [3]. 
It is plausible that an organization of networks has the 
potential to shed light on mechanisms for generating net- 
works, reveal how an unknown network should be treated 
once one has discerned its position in a taxonomy, or help 
identify a network family's anomalous members. Further 
potential applications of network taxonomies include un- 
supervised study of multiple realizations of a given model 
process (e.g., characterizing the similarities and differences
of many different networks drawn from the Erdős-Rényi
random graph model using the same parameter
values), examination of multiple empirical networks with 
known similar origins or generative processes, and the de- 
tection of anomalous changes in temporally ordered series 
of networks. In this paper, we develop a framework for 
the creation of network taxonomies [4]. In so doing, we
develop the requisite diagnostic tools and discuss several 
case studies that suggest how our methodology can help 
illuminate relationships both between and within families 
of networks. 

In aiming to construct taxonomies of networks, one
has to consider the scales at which one wants to compare
differences in network structures. Much research
has focused on extremes — either microscopic (e.g., node 
degree) or macroscopic (e.g., mean geodesic distance) 
properties — and numerous researchers have, for example, 
reported that many empirical networks possess heavy- 
tailed degree distributions or the small-world property 
[1, 5]. Given the ubiquity of such findings, it is clear
that more nuanced approaches are needed to make use- 
ful comparisons between networks. Indeed, interpreta- 
tions of microscopic and macroscopic approaches often 
implicitly assume that networks are homogeneous and 
ignore "mesoscopic" structures in networks. To over- 
come some of these limitations, earlier work has focused 
on the statistics of small, a priori specified modules 
called "motifs" [32] [37] , role-to-role connectivity profiles 
of nodes [8] , the isolation of statistically significant struc- 
tures called "backbones" [9], interrelations of network 
modules [10] , examination of the number of nodes located 
within "shells" [11], and the self-similarity of networks as
characterized by fractal exponents [12]. The taxonomic
framework that we develop in the present paper builds 
on the idea of examining network modules by computing 
community structures [13] [14] , as was also done in the 
work of [15], and we subsequently compare signatures
derived from community structure across networks. Im- 
portantly, although we use a specific method to uncover 
network communities, much of the introduced framework 
is independent of that choice. Consequently, our com- 
parative framework can accommodate a large variety of 
community detection schemes. 

The remainder of this paper is organized as follows. 
First, we discuss the detection of communities in net- 
works in order to find coherent groups of nodes that 
are densely connected to each other. We then intro- 
duce mesoscopic response functions (MRFs), which allow 
us to probe how the community structure of a network 
changes as a function of a resolution parameter that de- 
termines network scales of interest. We then illustrate 
MRFs using several examples of networks and compare 
the MRFs for several well-known generative models of 
networks. We use MRFs to develop a means to measure 
distance between a pair of networks, and use this com- 
parative measure to cluster networks and thereby develop 
taxonomies. Using 746 networks from numerous differ- 
ent fields, we construct a taxonomy of these networks. 
We then construct taxonomies of networks within fields 
using several case studies: voting in the United States 
Senate, voting in the United Nations General Assembly, 
Facebook networks at US universities, fungal networks, 
and networks of stock returns in the New York Stock 
Exchange. In each example, we expose structure that is 
either illuminating or can be checked against information 
from an external source (e.g., previously published inves- 
tigations). This suggests that our method for comparing 
networks is capturing important similarities and differ- 
ences. We conclude with a brief summary and discussion 
of our results. In addition, we provide further details
in the Appendices and Supplemental Material. Among
other topics, we examine the robustness of the obtained 
taxonomies, address some computational issues, tabulate 
some of the basic properties of the networks that we in- 
vestigated, and provide references for the network data 
sources used in this study. 

II. MULTI-RESOLUTION COMMUNITY 
DETECTION 

Our approach is based on network community structure 
[13] [14] . A community consists of a set of nodes for which 
there are more edges (or, in the case of weighted net- 
works, a greater total edge weight) connecting the nodes 
in the set than what would be expected by chance. The 
algorithmic detection of communities is a particularly ac- 
tive area of network science, in part because communi- 
ties are thought to be related to functional units in many 
networks and in part because they can strongly influence 
dynamical processes that operate on networks [13] [14] . 

In this paper, we detect communities using the multi-resolution
Potts method [13] [14] [16], a generalization of
modularity optimization [H El HI US HI]. (Modularity
optimization is perhaps the most popular approach for
detecting communities.) Given a network adjacency matrix
$A_{ij}$, we find communities by minimizing the Hamiltonian
of the infinite-range $N$-state Potts spin glass

$$\mathcal{H}(\lambda) = -\sum_{i \neq j} (A_{ij} - \lambda P_{ij})\,\delta(C_i, C_j), \tag{1}$$

where $C_i$ indicates the community (state) of node (spin)
$i$, $\lambda$ is a resolution parameter, and $J(\lambda)$ is the coupling
matrix with entries $J_{ij}(\lambda) = A_{ij} - \lambda P_{ij}$ representing the interaction
strength between node $i$ and node $j$ in the Potts Hamiltonian.
We use the (undirected-network) null model
$P_{ij} = k_i k_j/(2m)$, where $k_i$ denotes the strength (total
edge weight) of node $i$ and $m$ is the total edge weight in
the network [9]. By tuning the resolution parameter $\lambda$,
we can detect communities at multiple scales of a network.
Our particular choice of $J_{ij}$ implies that we are
optimizing modularity (with the addition of the resolution
parameter) [13] [14].
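
As a minimal sketch of Eq. (1), the following Python function evaluates the Potts Hamiltonian of a given partition under the null model $P_{ij} = k_i k_j/(2m)$. The function name, the dense list-of-lists adjacency format, and the toy graph (two triangles joined by one edge) are illustrative assumptions, not part of the original study. The sketch evaluates the energy of a partition only; it does not search for the minimizing partition.

```python
def potts_hamiltonian(adj, communities, resolution):
    """Return H(lambda) = -sum_{i != j} (A_ij - lambda * P_ij) * delta(C_i, C_j)."""
    n = len(adj)
    strengths = [sum(row) for row in adj]   # k_i: total edge weight at node i
    two_m = sum(strengths)                  # 2m: every undirected edge is counted twice
    energy = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and communities[i] == communities[j]:
                null = strengths[i] * strengths[j] / two_m   # P_ij = k_i k_j / (2m)
                energy -= adj[i][j] - resolution * null
    return energy


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
ADJ = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
two_triangles = [0, 0, 0, 1, 1, 1]
one_block = [0, 0, 0, 0, 0, 0]
singletons = [0, 1, 2, 3, 4, 5]

h_split = potts_hamiltonian(ADJ, two_triangles, 1.0)
h_block = potts_hamiltonian(ADJ, one_block, 1.0)
h_single = potts_hamiltonian(ADJ, singletons, 1.0)
```

At resolution 1, the two-triangle partition has lower energy than the single-block partition in this toy graph, and the all-singletons partition has zero energy because no pair of distinct nodes shares a community.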

To compare networks, we create profiles of summary 
statistics that characterize the community structure of 
each network at different mesoscopic scales. We also 
study a wide variety of networks that contain different 
numbers of nodes and edges. (We enumerate the net- 
works that we consider in Table II of the Supplemental 
Material.) To ensure that we can compare the profiles 
for different networks, we sweep the resolution parameter
$\lambda$ from a minimum value $\Lambda_{\min}$ to a maximum value
$\Lambda_{\max}$ (discussed in detail below). We define these quantities
separately for each network such that the number
of communities $\eta$ into which the network is partitioned
is 1 at $\Lambda_{\min}$ and is equal to the total number of nodes $N$
at $\Lambda_{\max}$. In other words, one can think of $\lambda$ as a parameter
that controls the fragmentation of a network into
communities.

To find the minimum and maximum resolution-parameter
values, consider the interactions in Eq. (1).
An interaction is called ferromagnetic when $J_{ij} > 0$ and
antiferromagnetic when $J_{ij} < 0$. For each pair of nodes
$i$ and $j$, we find the resolution $\lambda = \Lambda_{ij}$ at which the
interaction $J_{ij}$ is neutral (i.e., $J_{ij}(\Lambda_{ij}) = 0$), leading to
$\Lambda_{ij} = A_{ij}/P_{ij}$. We thereby identify two special resolutions:

$$\Lambda_{\min} = \max\{\Lambda_{ij} : \eta(\Lambda_{ij}) = 1\}, \tag{2}$$

$$\Lambda_{\max} = \max_{i \neq j}\{\Lambda_{ij}\} + \epsilon, \tag{3}$$

where $\epsilon > 0$ is any small number (we use $\epsilon = 10^{-6}$ in the
present paper). The resolution $\Lambda_{\min}$ is the largest $\Lambda_{ij}$
value for which community detection yields a single community;
note that this need not be the minimum non-zero
value of $\Lambda_{ij}$. Including the small number $\epsilon$ in the definition
of $\Lambda_{\max}$ ensures that all edges are antiferromagnetic
at resolution $\lambda = \Lambda_{\max}$ and thereby forces each node into
its own community.
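
The neutral resolutions and the upper end of the sweep can be sketched as follows. This is an illustration under stated assumptions: the function names and the toy graph are hypothetical, and only node pairs joined by an edge are listed (absent edges have $A_{ij} = 0$ and do not affect the maximum). The lower end of the sweep is not computed here because it requires running a community-detection heuristic at each candidate resolution.

```python
def neutral_resolutions(adj):
    """Sorted values Lambda_ij = A_ij / P_ij for node pairs i < j joined by an edge."""
    n = len(adj)
    strengths = [sum(row) for row in adj]   # k_i
    two_m = sum(strengths)                  # 2m
    values = []
    for i in range(n):
        for j in range(i + 1, n):
            if adj[i][j] > 0:
                # A_ij / P_ij with P_ij = k_i k_j / (2m)
                values.append(adj[i][j] * two_m / (strengths[i] * strengths[j]))
    return sorted(values)


def largest_resolution(adj, eps=1e-6):
    """Largest neutral resolution plus a small offset eps, as in the definition of the sweep's upper end."""
    return max(neutral_resolutions(adj)) + eps


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
ADJ = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
lambdas = neutral_resolutions(ADJ)
lam_max = largest_resolution(ADJ)
```

In this toy graph the bridge between the two high-degree nodes has the smallest neutral resolution, so it is the first edge to become antiferromagnetic as the resolution increases, while edges between the two lowest-degree nodes are the last.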



III. MESOSCOPIC RESPONSE FUNCTIONS 
(MRFS) 

To describe how a network disintegrates into communities
as the value of $\lambda$ is increased from $\Lambda_{\min}$ to $\Lambda_{\max}$
(see Fig. 1(a) for a schematic), one needs to select summary
statistics. There are many possible ways to summarize
such a disintegration process, and we focus on three
diagnostics that characterize fundamental properties of
network communities.

First, we use the value of the Hamiltonian $\mathcal{H}(\lambda)$ from Eq. (1),
which is a scalar quantity closely related to network modularity
and quantifies the energy of the system [13] [14].
Second, we calculate a partition entropy $S(\lambda)$ to characterize
the community size distribution. To do this, let $n_k$
denote the number of nodes in community $k$ and define
$p_k = n_k/N$ to be the probability to choose uniformly at
random a member node of community $k$. This yields a
(Shannon) partition entropy of $S(\lambda) = -\sum_{k=1}^{\eta(\lambda)} p_k \log p_k$,
which quantifies the disorder in the associated community
size distribution. Third, we use the number of communities
$\eta(\lambda)$.
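
The partition entropy can be computed directly from a list of community labels. The following is a minimal sketch; the function name and the label-list input format are assumptions made for illustration.

```python
import math
from collections import Counter


def partition_entropy(communities):
    """Shannon entropy -sum_k p_k log p_k of the community size distribution."""
    n = len(communities)
    sizes = Counter(communities).values()   # n_k: number of nodes in community k
    return -sum((size / n) * math.log(size / n) for size in sizes)


s_single = partition_entropy([0, 0, 0, 0])       # one community
s_halves = partition_entropy([0, 0, 1, 1])       # two equal communities
s_singletons = partition_entropy([0, 1, 2, 3])   # every node alone
```

The entropy vanishes when all nodes share one community and reaches its maximum, the logarithm of the number of nodes, when every node is its own community.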

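Because we need to normalize $\mathcal{H}$, $S$, and $\eta$ to compare them
effectively across networks, we define an effective energy

$$\mathcal{H}_{\mathrm{eff}}(\lambda) = \frac{\mathcal{H}(\lambda) - \mathcal{H}_{\min}}{\mathcal{H}_{\max} - \mathcal{H}_{\min}}, \tag{4}$$

where $\mathcal{H}_{\min} = \mathcal{H}(\Lambda_{\min})$ and $\mathcal{H}_{\max} = \mathcal{H}(\Lambda_{\max})$; an effective
entropy

$$S_{\mathrm{eff}}(\lambda) = \frac{S(\lambda) - S_{\min}}{S_{\max} - S_{\min}} = \frac{S(\lambda)}{\log N}, \tag{5}$$

where $S_{\min} = S(\Lambda_{\min})$ and $S_{\max} = S(\Lambda_{\max})$; and an
effective number of communities

$$\eta_{\mathrm{eff}}(\lambda) = \frac{\eta(\lambda) - \eta_{\min}}{\eta_{\max} - \eta_{\min}} = \frac{\eta(\lambda) - 1}{N - 1}, \tag{6}$$

where $\eta_{\min} = \eta(\Lambda_{\min}) = 1$ and $\eta_{\max} = \eta(\Lambda_{\max}) = N$.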
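
The three normalizations are min-max rescalings onto the unit interval. The sketch below assumes that the raw energy, entropy, and number of communities have already been computed at the resolution of interest; the function names are illustrative. It uses the facts stated in the text that the network forms one community at the smallest resolution and $N$ singleton communities at the largest.

```python
import math


def effective_energy(h, h_min, h_max):
    """Min-max rescaling of the energy between its values at the two ends of the sweep."""
    return (h - h_min) / (h_max - h_min)


def effective_entropy(s, n_nodes):
    """Entropy rescaled by its maximum log N (its minimum, for one community, is 0)."""
    return s / math.log(n_nodes)


def effective_communities(eta, n_nodes):
    """Number of communities rescaled from the range [1, N] to [0, 1]."""
    return (eta - 1) / (n_nodes - 1)


# Hypothetical values for a 34-node network.
e_mid = effective_energy(-7.0, -10.0, 0.0)
s_top = effective_entropy(math.log(34), 34)
c_low = effective_communities(1, 34)
c_top = effective_communities(34, 34)
```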

Some networks contain a small number of entries $\Lambda_{ij}$
that are orders of magnitude larger than most other entries.
For example, in the network of Facebook friendships
at Caltech [21], 98% of the $\Lambda_{ij}$ entries are less than
100, but 0.02% of them are larger than 8000. These large
$\Lambda_{ij}$ values arise when two low-strength nodes become
connected. Using the null model $P_{ij} = k_i k_j/(2m)$, the
interaction between two nodes $i$ and $j$ becomes antiferromagnetic
when $\lambda > A_{ij}/P_{ij} = 2m A_{ij}/(k_i k_j)$. If the network
has a large total edge weight but both $i$ and $j$ have
small strengths compared to other nodes in the network,
then $\lambda$ needs to be large to make the interaction antiferromagnetic.
In prior studies, network community structure
has been investigated at different mesoscopic scales
by considering plots of various diagnostics as a function
of the resolution parameter [13, 14, 16]. In the present
example, such plots would be dominated by interactions
that require large resolution-parameter values to become
antiferromagnetic. To overcome this issue, we define the
effective fraction of antiferromagnetic edges



$$\xi = \xi(\lambda) = \frac{\ell^{A}(\lambda) - \ell^{A}(\Lambda_{\min})}{\ell^{A}(\Lambda_{\max}) - \ell^{A}(\Lambda_{\min})} \in [0, 1], \tag{7}$$



where $\ell^{A}(\lambda)$ is the total number of antiferromagnetic interactions
for the given value of $\lambda$ in the network. In
other words, it is the number of $\Lambda_{ij}$ elements that are
smaller than $\lambda$. Thus, $\ell^{A}(\Lambda_{\min})$ is the largest number
of antiferromagnetic interactions for which the network
still forms a single community, and the effective number
of antiferromagnetic interactions $\xi(\lambda)$ is the number
of antiferromagnetic interactions (normalized to the unit
interval) in excess of $\ell^{A}(\Lambda_{\min})$. The function $\xi(\lambda)$
increases monotonically in $\lambda$.
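
Because the count of antiferromagnetic interactions is the number of neutral resolutions below the current resolution, the effective fraction can be sketched with a binary search over the sorted neutral values. The function name, the argument names, and the numbers in the example are hypothetical; in particular, the lower end of the sweep is simply supplied as an input here rather than obtained from community detection.

```python
from bisect import bisect_left


def effective_fraction(neutral_values, resolution, lam_min, lam_max):
    """Normalized count of interactions whose neutral resolution lies below `resolution`."""
    ordered = sorted(neutral_values)

    def count_below(lam):
        # bisect_left returns how many entries are strictly smaller than lam
        return bisect_left(ordered, lam)

    base = count_below(lam_min)
    return (count_below(resolution) - base) / (count_below(lam_max) - base)


# Hypothetical neutral resolutions and sweep end points.
values = [1.0, 2.0, 2.0, 3.5]
lam_min, lam_max = 1.5, 3.5 + 1e-6
xi_start = effective_fraction(values, lam_min, lam_min, lam_max)
xi_mid = effective_fraction(values, 2.5, lam_min, lam_max)
xi_end = effective_fraction(values, lam_max, lam_min, lam_max)
```

The two interactions that share the neutral value 2.0 switch sign together, so the effective fraction jumps by two steps at once when the resolution passes that value.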

Sweeping $\lambda$ from $\Lambda_{\min}$ to $\Lambda_{\max}$ corresponds to sweeping
the value of $\xi$ from 0 to 1. (One can think of $\lambda$ as
a continuous variable and $\xi$ as a discrete variable that
changes with events.) As we perform such sweeping for
a given network, the number of communities increases
from $\eta(\xi = 0) = 1$ to $\eta(\xi = 1) = N$ and yields a vector
$(\mathcal{H}_{\mathrm{eff}}(\xi), S_{\mathrm{eff}}(\xi), \eta_{\mathrm{eff}}(\xi))$ whose components we call
the mesoscopic response functions (MRFs) of that network.
Because $\mathcal{H}_{\mathrm{eff}} \in [0, 1]$, $S_{\mathrm{eff}} \in [0, 1]$, $\eta_{\mathrm{eff}} \in [0, 1]$, and
$\xi \in [0, 1]$ for every network, we can compare the MRFs
across networks and use them to identify groups of networks
with similar mesoscopic structures. In Fig. 1(b),
we show the Zachary Karate Club network [83] for different
values of $\xi$. As more edges become antiferromagnetic,
the network fragments into smaller communities,
and panel (c) shows the corresponding MRFs. In Fig. 2,
we show a schematic of the MRF in which we emphasize
its interpretation as a 3-dimensional vector. In Fig. 3, we
show example MRFs for several other networks.

FIG. 1. (Color online) (a) Schematic of some of the ways
that a network can break up into communities as the value
of $\lambda$ (or $\xi$) is increased. (b) Zachary Karate Club network
[83] for different values of the effective fraction of antiferromagnetic
edges $\xi$. All interactions are either ferromagnetic or
antiferromagnetic, i.e., for the values of $\xi$ used, there are no
neutral interactions. We color edges in blue if the corresponding
interactions are ferromagnetic, and we color them red if
the interactions are antiferromagnetic. We color the nodes
based on community affiliation. (c) The $\mathcal{H}_{\mathrm{eff}}$, $S_{\mathrm{eff}}$, and $\eta_{\mathrm{eff}}$
MRFs, and the interaction matrix $J$ for different values of $\xi$.
We color elements of the interaction matrix by depicting the
absence of an edge in white, ferromagnetic edges in blue, and
antiferromagnetic edges in red.



Although minimizing Eq. (1) is an NP-hard problem
[23] and $\mathcal{H}$ possesses a complicated landscape of local optima
for many networks [24], there exist numerous good
computational heuristics that make finding a nearly optimal
partition of the network into communities at a
given resolution computationally tractable [13, 14]. Thus
far, we have reported results that were obtained by optimizing
modularity using the locally greedy Louvain algorithm
[25] because its speed was important for studying
large networks. We have compared the results that we report
in the present work to those obtained from optimizing
modularity using spectral and simulated-annealing
algorithms, and obtained similar MRFs and taxonomies
for them (see Appendix B for more details).



IV. EXAMPLES OF MRFS 

The shapes of the MRFs summarize many factors,
including the fraction of possible edges in a network that
are actually present, the relative weights of inter- versus
intra-community edges, the edge weights compared
with the expected edge weights in the null model, the
number of edges that need to become antiferromagnetic
for a community to fragment, and the way in which the
communities fragment (e.g., whether a community splits
in half or a single node leaves a community when a particular
edge becomes antiferromagnetic). To understand
the effects of some of these factors on the shapes of the
MRFs, we consider some examples.

FIG. 2. (Color online) The mesoscopic response function
(MRF) of a given network consists of a 3-dimensional vector
$(\mathcal{H}_{\mathrm{eff}}(\xi), S_{\mathrm{eff}}(\xi), \eta_{\mathrm{eff}}(\xi))$, where $\xi \in [0, 1]$. By construction,
the MRF starts from the bottom front corner [$\mathcal{H}_{\mathrm{eff}}(\xi = 0)$,
$S_{\mathrm{eff}}(\xi = 0)$, $\eta_{\mathrm{eff}}(\xi = 0)$] and ends at the top back corner
[$\mathcal{H}_{\mathrm{eff}}(\xi = 1)$, $S_{\mathrm{eff}}(\xi = 1)$, $\eta_{\mathrm{eff}}(\xi = 1)$]. The colored surface
plot shows where most MRFs lie. We also show schematic
MRFs in blue (solid curve) and red (dashed curve).

Of particular interest are plateaus in the $\eta_{\mathrm{eff}}$ and $S_{\mathrm{eff}}$
curves that are accompanied by large increases in $\mathcal{H}_{\mathrm{eff}}$.
As illustrated in Fig. 3(a), the New York Stock Exchange
(NYSE) network from 1980 to 1999 [22] provides
a good example of this behavior. This network is an
instance from the category of similarity networks. We
use this label to describe networks that have been constructed
by starting from some node-level quantity or attribute
and then defining the edges based on some form
of similarity or correlation measure between each pair
of nodes. Similarity networks tend to be complete (or
almost complete) and weighted networks, except when
they have been deliberately thresholded. In this particular
example, each node represents a stock, and the
strength of the edge connecting stocks $i$ and $j$ is linear
in the Pearson correlation between the daily logarithmic
returns of the stocks. (See Section IX E for more details.)
Plateaus imply that as the resolution $\lambda$ is increased (leading
to an increase in $\mathcal{H}_{\mathrm{eff}}$), the communities remain unchanged
even though the number and strength of antiferromagnetic
interactions increase. As $\lambda$ is increased and
more interactions become antiferromagnetic, there is an
increased energy incentive for communities to break up.
Community partitions in such plateaus tend to be robust
and have the potential to represent interesting structures
[13 [H [II [27]. 

In Fig. 3(b), we show MRFs for a "fractal" network
[4], which demonstrates that plateaus in the $\eta_{\mathrm{eff}}$ and $S_{\mathrm{eff}}$
curves need not be accompanied by significant changes
in $\mathcal{H}_{\mathrm{eff}}$. Such plateaus can be explained by considering
the distribution of $\Lambda_{ij}$ values. If several interactions have
identical values of $\Lambda_{ij}$, then the interactions all become
antiferromagnetic at exactly the same resolution value.
This leads to a significant increase in the effective fraction
of antiferromagnetic edges $\xi$ but only a small change in
$\mathcal{H}_{\mathrm{eff}}$. If these interactions do not result in additional
communities, then we obtain plateaus in the $\eta_{\mathrm{eff}}$ and $S_{\mathrm{eff}}$
curves.

To demonstrate qualitatively different behavior, we
show the MRFs for the Biogrid Drosophila melanogaster
network and the Garfield Scientometrics citation network
in Fig. 3(c) and Fig. 3(d), respectively. A common feature
in these MRFs is the sharp initial increase in the
curves that results from the networks initially breaking
into two communities.

Another family of networks, which we will discuss in
more detail in our case studies, consists of political voting
networks. These voting networks are also similarity networks:
we have constructed these networks so that an
edge between two nodes indicates the level of agreement
on votes between two entities, and each edge takes a value
between 0 and 1. In Fig. 3(e), we show the MRFs for the
voting network of the United Kingdom House of Commons
during the period 2001-2005 [53]; in Fig. 3(f), we
show the MRFs for the roll-call voting network for the
108th (2003-2004) United States House of Representatives
[30, 50-52]. In both cases, we observe that sharp increases
in $\mathcal{H}_{\mathrm{eff}}$ can be accompanied by only small changes
in $\eta_{\mathrm{eff}}$ and $S_{\mathrm{eff}}$. To see how this can arise, we again
consider the distribution of $\Lambda_{ij}$ values. If the $\Lambda_{ij}$ distribution
is multi-modal, there can be a large difference
between consecutive $\Lambda_{ij}$ values. A large increase in $\lambda$
is then needed to increase $\xi$, which in turn results in a
large change in $\mathcal{H}_{\mathrm{eff}}$. However, the change in $\eta_{\mathrm{eff}}$ is small
because this only results in a single additional antiferromagnetic
interaction.



V. COMPARING NETWORK MODELS 

To provide further insights into MRFs, we consider
Erdős-Rényi (ER) [J, Barabási-Albert (BA) [3], and
Watts-Strogatz (WS) [2] networks. These network models
are stochastic, and there is a large ensemble of possible
network realizations for each choice of parameter
values in these models. However, even with the ensuing
structural variation, networks generated by a given
one of these three models exhibit similar properties at
mesoscopic and macroscopic scales, so we expect MRFs
for different realizations of a given model to be similar.

FIG. 3. (Color online) Example mesoscopic response functions
(MRFs). The curves show $\mathcal{H}_{\mathrm{eff}}$ (pink, dashed), $S_{\mathrm{eff}}$
(blue, dash-dotted), and $\eta_{\mathrm{eff}}$ (black, solid) as a function of the
effective fraction of antiferromagnetic edges $\xi$ for the following
networks: (a) New York Stock Exchange (NYSE), 1980-1999
[22]; (b) Fractal (10,2,8) [4]; (c) Biogrid D. melanogaster [55];
(d) Garfield scientometrics citations [40]; (e) United Kingdom
House of Commons voting, 2001-2005 [53]; (f) Roll-call voting
of 108th United States House of Representatives [30, 50-52].



In Fig. 4, we compare the MRFs for 1000 realizations
of each model for networks with $N = 1000$ nodes and
mean degree $\langle k \rangle = 10$. For the WS networks, we set
the edge rewiring probability at $p = 0.1$. As illustrated
in Fig. 4, we obtain a narrow range of possible MRFs
for fixed parameter values. This comparison illustrates
that the MRF profiles of the three different models are
distinctive. In addition, for each model there is little
variation in the behavior of the MRFs across different
network realizations with the same parameter values.





FIG. 4. (Color online) MRFs for 1000 realizations of Erdős-Rényi
(ER), Barabási-Albert (BA), and Watts-Strogatz (WS)
networks. Each network has $N = 1000$ nodes and mean degree
$\langle k \rangle = 10$. For each value of $\xi$, the upper curves show
the maximum values of $\mathcal{H}_{\mathrm{eff}}$ (top row), $S_{\mathrm{eff}}$ (middle row),
and $\eta_{\mathrm{eff}}$ (bottom row) for all networks in the ensemble; the
lower curves show the corresponding minimum values, and the
dashed curves show the corresponding means.

It is also instructive to consider variation in MRF
shapes for a particular network model for different parameter
values. We focus on WS networks because they
illuminate the effect of the distribution of $\Lambda_{ij}$ values on
the shapes of the MRFs. In Fig. 5, we show MRFs for WS
networks for different values of the edge rewiring probability
$p$. (We continue using $N = 1000$ and $\langle k \rangle = 10$.) We
also show the distribution of $\Lambda_{ij}$ values for each network.

For small rewiring probabilities, the MRFs have many
steps. As with prior examples, we can see how this
feature arises by considering the distribution of $\Lambda_{ij}$ values.
When the rewiring probability is small, many nodes
possess the same degree, which results in the presence
of many interactions with identical $\Lambda_{ij}$ values (see the
bottom left panel of Fig. 5). Because several interactions
have identical $\Lambda_{ij}$ values, these interactions all become
antiferromagnetic at exactly the same resolution-parameter
value, so the behavior of MRFs only changes
for a small number of $\xi$ values. As the rewiring probability
$p$ is increased, the degree and $\Lambda_{ij}$ distributions become
more heterogeneous, which leads to smoother MRFs. For
a rewiring probability of $p = 1$, the WS network is just
an ER network.




FIG. 5. (Color online) Upper panels: MRFs for Watts-Strogatz
networks for different values of the rewiring probability
$p$. Each network has $N = 1000$ nodes and mean degree
$\langle k \rangle = 10$. Lower panels: distributions of $\Lambda_{ij}$ values for each
network. As expected, the MRFs for $p = 1$ are identical to
those of an Erdős-Rényi network with $N = 1000$ and $\langle k \rangle = 10$.



VI. MEASURING DISTANCE BETWEEN 
NETWORKS 

In the framework that we have introduced in this paper,
comparing two networks at the mesoscopic level
amounts to characterizing the differences in behavior of
the corresponding MRFs. To quantify such differences,
we define a distance between two networks with respect
to one of the summary statistics as the area between the
corresponding MRFs. For example, the distance between
two networks $i$ and $j$ with respect to the effective energy
$\mathcal{H}_{\mathrm{eff}}$ is given by

$$d^{\mathcal{H}}_{ij} = \int_0^1 \left| \mathcal{H}^{i}_{\mathrm{eff}}(\xi) - \mathcal{H}^{j}_{\mathrm{eff}}(\xi) \right| d\xi. \tag{8}$$

For the effective entropy and effective number of communities,
the distances are given by $d^{S}_{ij} = \int_0^1 |S^{i}_{\mathrm{eff}}(\xi) - S^{j}_{\mathrm{eff}}(\xi)|\, d\xi$
and $d^{\eta}_{ij} = \int_0^1 |\eta^{i}_{\mathrm{eff}}(\xi) - \eta^{j}_{\mathrm{eff}}(\xi)|\, d\xi$, respectively.
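
A numerical version of this area-between-curves distance can be sketched as follows, assuming that both MRFs have been sampled on the same increasing grid of effective-fraction values. The function name and the sample curves are hypothetical, and the trapezoidal rule is an assumed discretization (it is exact only where the two curves do not cross between neighboring grid points).

```python
def mrf_distance(xi, curve_a, curve_b):
    """Trapezoidal estimate of the area between two curves sampled on the grid xi."""
    gaps = [abs(a - b) for a, b in zip(curve_a, curve_b)]
    return sum(
        0.5 * (gaps[k] + gaps[k + 1]) * (xi[k + 1] - xi[k])
        for k in range(len(xi) - 1)
    )


# Hypothetical MRF samples on a three-point grid.
grid = [0.0, 0.5, 1.0]
linear = [0.0, 0.5, 1.0]
early_rise = [0.0, 1.0, 1.0]

d_ab = mrf_distance(grid, linear, early_rise)
d_ba = mrf_distance(grid, early_rise, linear)
d_aa = mrf_distance(grid, linear, linear)
```

Because both curves take values in the unit interval on a unit-length grid, the result also lies between 0 and 1, and it is symmetric in its two curve arguments.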

We represent the resulting three sets of distances (computed
for each pair of networks from the 746 networks
that we consider; see Table I) in matrix form as $D^{\mathcal{H}}$,
$D^{S}$, and $D^{\eta}$. These distance measures have several desirable
properties. First, they compare MRFs across all
network scales (i.e., for all values of $\xi$); second, each distance
is bounded between 0 and 1; third, the distances
are easy to interpret, as each of them corresponds to the
geometric area between (a certain dimension of) a pair
of MRFs; and finally, we find a posteriori that these distances
can be used to cluster networks accurately (see
the discussions below).

We have computed MRFs for the energy $\mathcal{H}$, entropy $S$,
and number of communities $\eta$, but we can proceed similarly
with any desired summary statistic. If two diagnostics
provide similar information, then one of them can
be excluded without significant loss of information. To
check whether the summary statistics are sufficiently
different, for the set of networks considered here, for it to
be worthwhile to include all of them, we calculated the
Pearson correlation coefficient between their corresponding
distance measures. The correlations between the
pairs of distances are $r(d^{\mathcal{H}}_{ij}, d^{S}_{ij}) = 0.36$, $r(d^{\mathcal{H}}_{ij}, d^{\eta}_{ij}) = 0.24$,
and $r(d^{S}_{ij}, d^{\eta}_{ij}) = 0.58$. These correlations are not sufficiently
high to justify excluding any of the summary
statistics.
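
The correlation check can be sketched by flattening the upper triangle of each distance matrix into a vector and computing the Pearson coefficient between two such vectors. The helper names and the small matrices below are hypothetical and serve only to illustrate the computation.

```python
import math


def upper_triangle(matrix):
    """Entries above the diagonal of a square matrix, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical distance matrices for three networks.
D_first = [[0.0, 0.1, 0.4], [0.1, 0.0, 0.3], [0.4, 0.3, 0.0]]
D_second = [[0.0, 0.2, 0.8], [0.2, 0.0, 0.6], [0.8, 0.6, 0.0]]

flat_first = upper_triangle(D_first)
r_same_shape = pearson(flat_first, upper_triangle(D_second))
r_reversed = pearson([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```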

In the interest of parsimony, and given the nonvanishing
correlations between the distance measures,
we reduce the number of distance measures using principal
component analysis (PCA) [39]. Starting with $\mathcal{N}$
networks, we create a $\frac{1}{2}\mathcal{N}(\mathcal{N} - 1) \times 3$ matrix in which
each column corresponds to the vector representation
of the upper triangle of one of the distance matrices
$D^{\mathcal{H}}$, $D^{S}$, and $D^{\eta}$, and we perform a PCA on this matrix.
We then define a distance matrix $D^{p}$ with elements
$d^{p}_{ij} = w_{\mathcal{H}} d^{\mathcal{H}}_{ij} + w_{S} d^{S}_{ij} + w_{\eta} d^{\eta}_{ij}$, where the weights are the
coefficients for the first principal component, and we normalize
the sum of squared coefficients to unity. The coefficients
are $w_{\mathcal{H}} = 0.24$, $w_{S} = 0.79$, and $w_{\eta} = 0.57$. The
first component accounts for about 69% of the variance,
so the distances $D^{p}$ provide a reasonable single-variable
projection of the distances $D^{\mathcal{H}}$, $D^{S}$, and $D^{\eta}$.
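
One way to obtain such weights is sketched below: the columns are centered, their covariance matrix is formed, and its leading eigenvector is found by power iteration. This is an assumed, simplified stand-in for a library PCA routine rather than the procedure used in the study; the function names and the toy rows are hypothetical. Each row holds the three distances for one pair of networks.

```python
import math


def first_principal_component(rows, iterations=200):
    """Unit-norm leading eigenvector of the column covariance matrix (power iteration)."""
    n, dim = len(rows), len(rows[0])
    means = [sum(r[c] for r in rows) / n for c in range(dim)]
    centered = [[r[c] - means[c] for c in range(dim)] for r in rows]
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(dim)]
        for a in range(dim)
    ]
    vec = [1.0] * dim
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]   # renormalize so squared weights sum to 1
    return vec


def combined_distance(row, weights):
    """Weighted sum of the three distances for one pair of networks."""
    return sum(w * d for w, d in zip(weights, row))


# Hypothetical rows in which the second and third distances are twice the first.
rows = [[t, 2.0 * t, 2.0 * t] for t in (0.0, 0.1, 0.2, 0.3, 0.4)]
weights = first_principal_component(rows)
d_pair = combined_distance(rows[3], weights)
```

For the toy rows, the recovered direction is proportional to (1, 2, 2), so the normalized weights are 1/3, 2/3, and 2/3.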

It is important that the distance measures for compar- 
ing networks are robust to small perturbations in network 
structure. Because many of the networks that we study 
are constructed empirically, they might contain false pos- 
itives and false negatives. In other words, the networks 
might falsely identify a relationship where none exists, 
and they also might fail to identify an existing relation- 
ship. Consequently, the topology and edge weights of an 
observed network might be slightly different than those 
of the actual underlying network. To test the robustness 
of our distance measures to such observational errors, we 
recalculate the MRFs for a subset of relatively small unweighted
networks in which, for each network, we rewire
a number of edges corresponding to a given percentage of
the total number of edges (5%, 10%, 20%, 50%, or 100%).
See Appendix A for more details. (We study networks
with up to 1000 nodes and only consider a subset of 25 
networks because of the computational costs of rewiring 
a large number of networks multiple times; however, we 
have performed the same investigation for 5 different sub- 
sets of 25 networks and obtained similar results. We list 
the networks in each subset in Table I of the Supplemen- 
tal Material.) We investigate two rewiring mechanisms: 
one in which the degree distribution is maintained, where 
we also ensure after each rewiring that the network forms 
a single connected component; and another in which the 
only constraint is that the network continues to consist 
of a single connected component after each edge rewiring 
[40]. We find in both cases that the structures of the
block-diagonalized distance matrices for the 25 networks
(see Figs. 14 and 15 in Appendix A) are robust to random
perturbations of the networks, thereby suggesting that 
our MRF distance measures are not sensitive to small 
structural perturbations. 
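
The first rewiring mechanism (degree-preserving, with the network kept connected) can be sketched with double-edge swaps. This is a generic illustration under stated assumptions, not necessarily the exact procedure of the study: the function names, the node labeling 0 to n-1, the retry limit, and the toy graph are all hypothetical.

```python
import random


def is_connected(n, edges):
    """True if the undirected graph on nodes 0..n-1 has a single component."""
    neighbors = {v: set() for v in range(n)}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        for w in neighbors[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def rewire_preserving_degrees(n, edges, n_swaps, seed=0, max_tries=10000):
    """Swap edge pairs (a,b),(c,d) -> (a,d),(c,b), keeping degrees and connectivity."""
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    present = set(edge_list)
    done = tries = 0
    while done < n_swaps and tries < max_tries:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c                      # pick one of the two possible pairings
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        if len({a, b, c, d}) < 4 or new1 in present or new2 in present:
            continue                         # would create a self-loop or a duplicate edge
        candidate = edge_list[:]
        candidate[i], candidate[j] = new1, new2
        if not is_connected(n, candidate):
            continue                         # reject swaps that disconnect the network
        present -= {edge_list[i], edge_list[j]}
        present |= {new1, new2}
        edge_list = candidate
        done += 1
    return edge_list


def degrees(n, edges):
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


# Hypothetical toy graph: an 8-node ring with two chords.
ring = [(k, (k + 1) % 8) for k in range(8)] + [(0, 4), (2, 6)]
rewired = rewire_preserving_degrees(8, ring, n_swaps=5)
```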



VII. CLUSTERING NETWORKS 



TABLE I. Network categories, the total number of networks 
assigned to each category, and the number of networks from 
each category included in the taxonomy in Fig. 6. For the
full taxonomy that uses all 746 networks, see Fig. 1 of the 
Supplemental Material. 

We assign each of the 746 networks to a category based
on its type (see Table I). Due to the varying availability
of different types of network data, the included networks 
are not evenly distributed across these categories. Many 
of the networks are either different temporal snapshots 
of the same system or different realizations of the same 
type of network. To have a more balanced distribution 
across the different categories, we focus on 189 of the 746 
networks. We only include categories for which we have 
8 or more networks, and we selected a subset of networks 
(uniformly at random) from the larger categories. We 
also exclude all synthetic networks. See Section IV of 
the Supplemental Material for the list of networks that 
we consider and Fig. 1 in Section II of the Supplemen- 
tal Material for a dendrogram showing a taxonomy we 
constructed using all 746 networks. 

Our primary reason for assigning each network to a 
category is to use such an external categorization to help 
assess the quality of taxonomies produced by the unsupervised
MRF clustering. For each way of computing
distance, we construct a dendrogram for the set of networks
using average linkage clustering, which is an agglomerative
hierarchical clustering technique [13, 41, 42].
In Fig. 6, we show a dendrogram obtained from the distance
matrix $D^p$. The colored rectangle underneath each
leaf indicates the network category. Contiguous blocks 
of color demonstrate that networks from the same cat- 
egory have been grouped together using the MRF clus- 
tering method, and the presence of such contiguous color 
blocks is an indication of the success of the MRF clus- 
tering scheme. 
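
As an illustration of the clustering step described above, the following Python sketch builds an average-linkage dendrogram from a precomputed distance matrix. It is a minimal sketch rather than the code used for the paper: the function name `average_linkage_taxonomy`, the toy distance matrix, and the use of SciPy (including its `optimal_ordering` option to place similar leaves next to each other) are assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, leaves_list, linkage
from scipy.spatial.distance import squareform


def average_linkage_taxonomy(D, n_groups):
    """Average-linkage clustering of a symmetric distance matrix D.

    Returns the linkage matrix, a leaf ordering, and flat group labels.
    """
    D = np.asarray(D, dtype=float)
    condensed = squareform(D, checks=True)  # condensed upper-triangle form
    # optimal_ordering=True reorders leaves so that adjacent leaves are close.
    Z = linkage(condensed, method="average", optimal_ordering=True)
    order = leaves_list(Z)
    labels = fcluster(Z, t=n_groups, criterion="maxclust")
    return Z, order, labels


# Toy data (not from the paper): six "networks" summarized by one number each.
x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
D_toy = np.abs(x[:, None] - x[None, :])
Z, order, labels = average_linkage_taxonomy(D_toy, n_groups=2)
```

The linkage matrix `Z` can be passed to `scipy.cluster.hierarchy.dendrogram` for plotting, and comparing `labels` with externally assigned categories gives a rough check on how well a taxonomy recovers them.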

The assignment of the networks to one of these categories
is of course to some extent subjective, as several
of the networks could belong to more than one category. 
For example, we could categorize the network of jazz mu- 
sicians [20] as either a collaboration network or a social 
network. The initial selection of network categories is 
also somewhat subjective. One could argue that if one 
has a social network category, then it is not necessary 
to have a collaboration network category as well because 
a collaboration network is a type of social network. We 
have attempted to maintain a balance between having 
too many categories and having too few of them. When 
such ambiguities have arisen, we have systematically cho- 
sen the more specific of the relevant categories (e.g., we 
placed the jazz musician network in the category of col- 
laboration networks rather than in the category of social 
networks) . 



VIII. TAXONOMIES OF EMPIRICAL 
NETWORKS 

All of the networks in some categories appear in blocks 
of adjacent leaves in the dendrogram in Fig. 6. For example,
there is a cluster of political voting networks at the
far left of the dendrogram. This cluster includes voting 
networks from the US Senate, the US House of Repre- 
sentatives, the UK House of Commons, and the United 
Nations General Assembly (UNGA). The clustering of 
these voting networks suggests that there are some common
features in the network representations of the different
legislative bodies. We also obtain blocks that consist
of all political committee networks and all metabolic networks.

FIG. 6. (Color online) Taxonomy for 189 networks. We construct the dendrogram (tree) using the distance $D^p$ and average
linkage clustering. We order the leaves of the dendrogram to minimize the distance between adjacent nodes and color the leaves
to indicate the type of network.

There are also several categories for which all except 
one or two networks cluster into a contiguous block. For 
example, all but two of the fungal networks appear in 
the same block and all but one of the Facebook networks 
are clustered together. The isolated Facebook network 
is the Caltech network, which is the smallest network of 
this type and which appears in a group next to that con- 
taining all of the other Facebook networks. We remark 
that the social organization of the community structure 
of the Caltech Facebook network has been shown to be 
different from those of the other Facebook networks [21] . 

Networks of certain categories do not appear in near- 
contiguous blocks. For example, protein interaction net- 
works appear in several clusters. These networks rep- 
resent interactions within several different organisms, so 
we would not expect all of them to be clustered together. 
Moreover, the data that we employed includes examples 
of protein interaction networks for the same organism in 
which the interactions were identified using different ex- 
perimental techniques, and these networks do not cluster 
together. This supports previous work suggesting that 
the properties of protein interaction networks are very 
sensitive to the experimental procedure used to identify
the interactions [44, 45]. Social networks are also distributed
throughout the dendrogram. This is unsurprising
given the extremely broad nature of the category,
which includes networks of very different sizes with edges 
representing a diverse range of social interactions. The 
leftmost outlying social network is the network of Marvel 
comic book characters [72] , which is arguably an atypical 
social network. 

The grouping (and, to some extent, the non-grouping) 
of networks by category suggests that the PCA-distance
$D^p$ between MRFs of different networks produces a sensible
taxonomy. It is important to ask, however, whether
a simpler approach based on a single network diagnostic,
such as edge density, can be comparably successful
at constructing a taxonomy. In Appendix D, we demonstrate
using some well-known diagnostics that this does
not appear to be the case, as the diagnostics we tried 
were unable to reproduce or explain the classifications 
that we produced using the MRFs. 

To compare the aggregate shapes of the MRFs
across categories, we show the bounds of the $\mathcal{H}_{\mathrm{eff}}$, $S_{\mathrm{eff}}$,
and $\eta_{\mathrm{eff}}$ curves for each category in Fig. 7. We again
consider all empirical network categories with at least 8
networks in them. This illustrates that the MRFs for
some classes of networks (such as political cosponsorship
and metabolic networks) are very similar to each
other, whereas there are large variations in the MRFs for
other categories (such as social and protein interaction
networks). The variety of different MRFs for the social
and protein interaction networks is consistent with the fact that
their constituent networks are scattered throughout the
dendrogram in Fig. 6.

FIG. 7. (Color online) MRFs for all of the network categories
containing at least 8 networks (see Table I). At each value of
$\xi$, the upper curve shows the maximum value of $\mathcal{H}_{\mathrm{eff}}$ (pink,
left panel in each category), $S_{\mathrm{eff}}$ (blue, center panel), and $\eta_{\mathrm{eff}}$
(black, right panel) for all networks in the category and the
lower curve shows the minimum value. The dashed curves
show the corresponding mean MRFs.



IX. CASE STUDIES 

We now consider several case studies, in which we gen- 
erate taxonomies for multiple realizations of particular 
types of networks and multiple time slices of particular 
networks. This enables us to compare these networks and 
(in some cases) illustrate possible connections between 
network function and mesoscopic network structure. 



A. Voting in the United States Senate 

Our first example deals with roll-call voting in the
United States Senate [30, 47, 50-52]. Establishing a taxonomy
of networks detailing the voting similarities of individual
legislators complements previous studies of these
data, and it facilitates the comparison of voting similarity
networks across time. We consider Congresses 1-110,
which cover the period 1789-2008. As in Ref. [50], we
construct networks from the roll-call data [30, 51] for
each two-year Congress such that the adjacency matrix
element $A_{ij} \in [0, 1]$ represents the number of times Senators
$i$ and $j$ voted the same way on a bill (either both in
favor of it or both against it) divided by the total number
of bills on which both of them voted. Following the
approach of Ref. [51], we only consider "non-unanimous"
roll-call votes, which are defined as votes in which at least
3% of the Senators were in the minority.
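
The construction above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the original processing code: the vote encoding (+1 yea, -1 nay, 0 did not vote), the function name `voting_similarity`, the choice to measure the minority share among the votes actually cast, and the zeroed diagonal are all assumptions.

```python
import numpy as np


def voting_similarity(votes, min_minority=0.03):
    """Agreement network from a (legislators x bills) vote array.

    Assumed encoding: +1 = yea, -1 = nay, 0 = did not vote.
    """
    votes = np.asarray(votes)
    yea = (votes == 1).sum(axis=0)
    nay = (votes == -1).sum(axis=0)
    cast = np.maximum(yea + nay, 1)
    # Keep only "non-unanimous" votes (minority share of cast votes >= 3%).
    keep = np.minimum(yea, nay) / cast >= min_minority
    v = votes[:, keep].astype(float)

    voted = (v != 0).astype(float)
    both_voted = voted @ voted.T  # bills on which both i and j voted
    # v_i * v_j is +1 for agreement, -1 for disagreement, 0 if either is absent,
    # so (both_voted + v v^T) / 2 counts agreements.
    agree = (both_voted + v @ v.T) / 2.0
    A = np.divide(agree, both_voted, out=np.zeros_like(agree), where=both_voted > 0)
    np.fill_diagonal(A, 0.0)  # assumption: no self-edges
    return A


# Toy chamber (not real data): 4 legislators, 3 bills; bill 0 is unanimous.
toy_votes = np.array([
    [1,  1, -1],
    [1, -1, -1],
    [1,  1,  0],
    [1, -1,  1],
])
A_senate = voting_similarity(toy_votes)
```

In the toy example, the unanimous first bill is discarded, and legislators 0 and 2 agree on the only retained bill on which both voted, so their edge weight is 1.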

Much research on the US Congress has been devoted 
to the ebb and flow of partisan polarization over time 
and the influence of parties on roll-call voting [50, 52].
In highly polarized legislatures, representatives tend to
vote along party lines, so there are strong similarities in
the voting patterns of members of the same party and
strong differences between members of different parties.
In contrast, during periods of low polarization, the party
lines become blurred. The notion of partisan polarization
can be used to help understand the taxonomy of Senates
in Fig. 8, in which we consider two measures of polarization.
The first measure uses DW-Nominate scores (a
multi-dimensional scaling technique commonly used in
political science [51, 52]), where the extent of polarization
is given by the absolute value of the difference between
the mean first-dimension DW-Nominate scores for
members of one party and the same mean for members
of the other party [30, 51, 52]. In particular, we use the
simplest such measure of polarization, called MPR polarization,
which assumes a competitive two-party system
and hence cannot be calculated prior to the 46th Senate.
The second measure we consider is network modularity
$Q$, which was recently shown to be a good measure of
polarization even for Congresses without clear party divisions
[50]. Modularity is given in terms of the energy $\mathcal{H}$
in Eq. (1) by $Q = -\mathcal{H}(\lambda = 1)/(2m)$. These two measures
exhibit fairly close agreement on the level of polarization
of each Congress for which they can both be calculated
[50].
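
To make the two polarization measures concrete, the following Python sketch computes both on toy data. It is a sketch under assumptions: the energy is taken to have the standard Potts form $\mathcal{H}(\lambda) = -\sum_{ij}(A_{ij} - \lambda P_{ij})\,\delta(c_i, c_j)$ with null model $P_{ij} = k_i k_j/(2m)$, the sum is taken over all ordered pairs (including $i = j$), and the function names, scores, and party labels are hypothetical.

```python
import numpy as np


def potts_energy(A, communities, lam=1.0):
    """Assumed Potts-type energy: -sum_ij (A_ij - lam * P_ij) * delta(c_i, c_j)."""
    A = np.asarray(A, dtype=float)
    c = np.asarray(communities)
    k = A.sum(axis=1)              # node strengths
    two_m = k.sum()                # total edge weight, counted twice
    P = np.outer(k, k) / two_m     # null model P_ij = k_i k_j / (2m)
    same = c[:, None] == c[None, :]
    return -np.sum((A - lam * P)[same])


def modularity(A, communities):
    """Q = -H(lambda = 1) / (2m)."""
    two_m = np.asarray(A, dtype=float).sum()
    return -potts_energy(A, communities, lam=1.0) / two_m


def party_mean_difference(scores, party):
    """Absolute difference between the two parties' mean first-dimension scores."""
    scores, party = np.asarray(scores, dtype=float), np.asarray(party)
    first, second = np.unique(party)  # assumes exactly two party labels
    return abs(scores[party == first].mean() - scores[party == second].mean())


# Toy network: two disconnected triangles, partitioned into the two triangles.
A_toy = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    A_toy[i, j] = A_toy[j, i] = 1.0
Q_toy = modularity(A_toy, [0, 0, 0, 1, 1, 1])

# Toy scores (not real DW-Nominate values).
gap_toy = party_mean_difference([-0.5, -0.3, 0.4, 0.6], ["D", "D", "R", "R"])
```

For the two-triangle example, each community contributes an internal weight of 6 against an expected weight of 3, which gives $Q = 0.5$.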

In Fig. 8(a), we include bars under the dendrograms to
represent the two polarization measures, both of which 
have been normalized to lie in the interval [0,1]. The 
bars demonstrate that Senates with similar levels of po- 
larization (measured in terms of both DW-Nominate 
scores and modularity values) are usually assigned to the 
same group, suggesting that our MRF clustering tech- 
nique groups Senates based on the polarization of roll-call 
votes. We have also colored dendrogram groups accord- 
ing to their mean levels of polarization using modularity, 
where the brown group in the dendrogram corresponds 
to the most highly polarized Senates and the blue group 
corresponds to the least polarized Senates. Although 
one ought to expect similarity in the results from the 
modularity-based measure of polarization and the MRF 
clustering, it is important to stress that the MRF cluster- 
ing method is based on different principles; modularity 
quantifies the extent to which a given network is "mod- 
ular", whereas the MRF clustering explicitly compares 
the differences in modular structures between any two 
networks at all scales. 

In Fig. 8(a), we also show the clusters that we obtained
for the Senate. They closely match the different
periods of polarization that have been identified using
modularity and DW-Nominate [50]. The cluster with
the highest mean polarization (shown in brown) consists
of Senates 7, 26-29, 44, 46-51, 53, 55, 66, and 104-110.
The 104th-110th Congresses correspond to a period of
extremely high polarization following the 1994 "Republican
Revolution", in which the Republican party earned
majority status in the House of Representatives for the first
time in more than 40 years [30, 50, 52]. The cluster with
the second highest mean polarization (shown in red) includes
several contiguous blocks of Senates, such as those
from Congresses 21-25, 35-39, and 56-61. The 21st-25th
Congresses (1829-1839) corresponded to a period of partisan
conflict between supporters of John Quincy Adams
and Andrew Jackson; it lasted until the emergence of the
Whigs and the Democratic party in the 25th Congress
[48, 50]. The American Civil War started during the
37th Congress, and a third party known as the Populist
Party was strong during the 56th-58th Congresses.

FIG. 8. (Color online) (a) Dendrogram for Senate roll-call
voting networks for the 1st-110th Congresses. Each leaf in
the dendrogram represents a single Senate. Two horizontal
color bars below the dendrograms indicate polarization measured
in terms of modularity (upper bar) and DW-Nominate
scores (lower bar). We color the branches in the dendrogram
corresponding to periods of similar polarization. (b) Polarization
of the US Senate as a function of time. The height
of each stem indicates the level of polarization measured using
modularity, and the color of each stem gives the cluster
membership of each Senate in (a). The black curve shows the
DW-Nominate polarization. Note that we have rescaled both
measures to the interval [0, 1].

The main differences between different clusters occur
in the $\mathcal{H}_{\mathrm{eff}}$ response functions. For the most polarized
Senates, there is a sharp shoulder in the $\mathcal{H}_{\mathrm{eff}}$ MRF that
becomes less pronounced as the polarization decreases.
We illustrate this in Fig. 9, in which we compare the
$\mathcal{H}_{\mathrm{eff}}$ MRFs for the (low-polarization) 85th and
(high-polarization) 108th Senates. The shoulder in the $\mathcal{H}_{\mathrm{eff}}$
curve for the 108th Senate is very pronounced, which can
be explained by considering the distribution of $\Lambda_{ij}$ values.
The 108th Senate has a bimodal $\Lambda_{ij}$ distribution that contains
a trough at $\Lambda_{ij} = 1$. Recall that $\Lambda_{ij} = A_{ij}/P_{ij}$, so
$\Lambda_{ij}$ compares the observed voting similarity of legislators
$i$ and $j$ with the similarity $P_{ij} = k_i k_j/(2m)$ expected
from random voting. If $\Lambda_{ij} < 1$, legislators $i$ and $j$ vote
differently more frequently than expected (with respect
to the chosen null model); if $\Lambda_{ij} > 1$, they vote more
similarly than expected. Therefore, the peaks in the $\Lambda_{ij}$
distribution above and below 1 correspond, respectively,
to intra-party and inter-party voting blocs. In a Senate
with low polarization, legislators from different parties
often vote in the same manner, so the values of $\Lambda_{ij}$ no
longer separate two distinct types of behavior.
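
The ratio $\Lambda_{ij} = A_{ij}/P_{ij}$ is straightforward to compute from a similarity matrix. The sketch below is illustrative only: the function name `lambda_values` and the toy two-party matrix are assumptions, and it presumes that every node has positive strength so that $P_{ij} > 0$.

```python
import numpy as np


def lambda_values(A):
    """Return Lambda_ij = A_ij / P_ij for all pairs i < j, with P_ij = k_i k_j / (2m)."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)            # node strengths (assumed positive)
    two_m = k.sum()
    P = np.outer(k, k) / two_m
    i, j = np.triu_indices_from(A, k=1)
    return A[i, j] / P[i, j]


# Toy polarized chamber (not real data): two parties of two legislators each,
# with similarity 0.9 within a party and 0.1 across parties.
A_polar = np.array([
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.9],
    [0.1, 0.1, 0.9, 0.0],
])
lam_polar = lambda_values(A_polar)
n_above = int((lam_polar > 1).sum())  # pairs more similar than expected
n_below = int((lam_polar < 1).sum())  # pairs less similar than expected
```

In this toy example, the two intra-party pairs have $\Lambda_{ij} > 1$ and the four inter-party pairs have $\Lambda_{ij} < 1$, which is the kind of two-peaked separation described above for highly polarized Senates.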

We also examined roll-call voting networks in the US
House of Representatives and found many features
similar to those that we have presented for the US Senate.
For example, the highly polarized 104th-110th Congresses,
which followed the "Republican Revolution", appear
in the same cluster for both the House and Senate.
We also observed some differences in the clusters for the
two chambers. For example, the 78th-102nd Senates all
appeared in the same cluster. For the House, however,
Congresses 80, 88, 89, and 98-102 did not appear in the
same cluster as the other Congresses between 78 and 102;
instead, they appeared in a cluster that also included the
26th-28th Houses. The period around the 26th-28th Congresses
was particularly eventful: the 25th Congress saw the emergence of the Whigs
and the Democratic Party, and the abolitionist movement
was also prevalent (e.g., the Amistad seizure occurred in
1839 during the 26th Congress).




FIG. 9. (Color online) Comparison of the (low-polarization)
85th Senate and the (high-polarization) 108th Senate. The
panels show (a) the $\mathcal{H}_{\mathrm{eff}}$ MRFs and (b) the cumulative
distributions of $\Lambda_{ij}$ values.



B. Voting in the United Nations General Assembly

The United Nations General Assembly (UNGA) is one 
of the principal organs of the United Nations (UN), and 
it is the only part of the UN in which all member na- 
tions have equal representation. Although most resolu- 
tions are neither legally nor practically enforceable be- 
cause the General Assembly lacks enforcement powers on 
most issues, it is the only forum in which a large number 
of states meet and vote regularly on international issues. 
It also provides an interesting point of comparison with 
roll-call voting in the US Congress, as the level of agree- 
ment on UN resolutions tends to be much higher than 
that in the Senate and House [49] . 

We study voting for the 1st-63rd sessions (covering the
period 1946-2008), where each session corresponds to a
year [50]. For each session, we define an adjacency matrix
$A$ whose elements represent the number of times
countries $i$ and $j$ voted in the same manner in a session
(i.e., the sum of the number of times both countries voted
yea on the same resolution, both countries voted nay on
the same resolution, or both countries abstained from
voting on the same resolution) divided by the total number
of resolutions on which the UNGA voted in a session.
The matrix $A$, with elements $A_{ij} \in [0,1]$, thereby represents
a (similarity) network of weighted edges between
countries.
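
The UNGA construction differs from the Senate one in two respects: joint abstentions count as agreement, and the denominator is the total number of resolutions in the session. The following Python sketch illustrates this; the integer vote codes, the separate code for absences, the function name `unga_agreement`, and the zeroed diagonal are assumptions made for illustration.

```python
import numpy as np

# Hypothetical vote codes; ABSENT marks a country that did not take part.
YEA, NAY, ABSTAIN, ABSENT = 1, -1, 0, 9


def unga_agreement(votes):
    """Agreement matrix for one session from a (countries x resolutions) array."""
    votes = np.asarray(votes)
    n_countries, n_resolutions = votes.shape
    A = np.zeros((n_countries, n_countries))
    for code in (YEA, NAY, ABSTAIN):
        cast = (votes == code).astype(float)
        A += cast @ cast.T       # resolutions on which i and j both cast `code`
    A /= n_resolutions           # divide by all resolutions voted on in the session
    np.fill_diagonal(A, 0.0)     # assumption: no self-edges
    return A


# Toy session (not real data): 3 countries, 4 resolutions.
toy_session = np.array([
    [YEA,  YEA, ABSTAIN, NAY],
    [YEA,  NAY, ABSTAIN, NAY],
    [NAY,  NAY, ABSENT,  YEA],
])
A_unga = unga_agreement(toy_session)
```

In the toy session, countries 0 and 1 agree on three of the four resolutions (one joint yea, one joint abstention, and one joint nay), so their edge weight is 0.75.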

FIG. 10. (Color online) Dendrogram for the United Nations
General Assembly resolution voting network for the 1st-63rd
sessions (excluding the 19th session), covering the period
1946-2008. Each leaf in the dendrogram represents a sin- 
gle session. In the text, we discuss the coloring of groups of 
branches in the dendrogram. 



We cluster UNGA sessions by comparing MRFs for 
the corresponding voting networks. In Fig. 10, we plot a
dendrogram of the UNGA sessions and highlight some of 
the clusters, which correspond to notable periods in the 
recent history of international relations. The red cluster 
in the middle of the dendrogram consists of all post-Cold 
War sessions (1992-2008) except 1995. This group forms 
a larger cluster with some UNGA sessions from the 1970s 
and a cluster consisting of 1946, 1948, and 1950. These 
three sessions (shown in magenta) are all noteworthy:
1946 was the first session of the UNGA, the Universal
Declaration of Human Rights was introduced during the
1948 session, and the "Uniting for Peace" resolution was
passed during the 1950 session. At the rightmost part
of the dendrogram, we color in black a group that consists
of all sessions from 1979 to 1991 (excluding 1980).
The beginning of this period marked the end of détente
between the Soviet Union and the US following the former's
invasion of Afghanistan at the end of 1979, and
the end of this period saw the end of the Cold War. The 
large blue cluster in the leftmost part of the dendrogram 
consists primarily of sessions from before 1971 (though it 
also includes the sessions in 1977 and 1995). 



C. Facebook 

We now consider Facebook networks for 100 US universities
[21]. The nodes in each network represent users of
the Facebook social networking site, and the unweighted 
edges represent reciprocated "friendships" between users 
at a single-time snapshot in September 2005. We con- 
sider only edges between students at the same university, 
as this allows us to compare the structure of the networks 
at the different institutions. These networks represent 
complete data sets obtained directly from Facebook. In 
contrast to the previous examples, we are not compar- 
ing snapshots of the same network at different times but 
are instead comparing multiple realizations of the same 
type of network that have evolved independently. Such 
real-world ensembles of network data are rare, and constructing
a taxonomy will hopefully allow us to compare
and contrast the social organization at these institutions.

FIG. 11. (Color online) Dendrogram for 100 Facebook networks
of US universities at a single-time snapshot in September
2005. We order the leaves of the dendrogram to minimize
the distance between adjacent nodes. The color bars below 
the dendrogram indicate (top) the number of nodes in the 
networks N and (bottom) the fraction of possible edges that 
are present d. 

In Fig. 11, we show the dendrogram for Facebook networks
that we produced by comparing MRFs. The two
color bars below the dendrogram indicate (top) the number
of nodes $N$ in each network and (bottom) the fraction
of possible edges $d$ that are present (i.e., edge density).
The Facebook networks range in size from 762 to
41,536 nodes, and the edge density varies from 0.2% to
6%. In contrast to previous examples, we observe in this
case that two simple network properties appear to explain
most of the observed clustering of the networks.
An important feature of this example is that the $\mathcal{H}_{\mathrm{eff}}$,
$S_{\mathrm{eff}}$, and $\eta_{\mathrm{eff}}$ MRFs are each very similar in shape and lie
in a narrow range across all 100 institutions (see Fig. 7).
Such extreme similarity is remarkable (as one can see
in Fig. 7, this contrasts starkly with most of the other
examples), and it suggests that all of the Facebook networks
have very similar mesoscopic structural features.
If one also considers demographic information, then one
can find interesting differences between the networks [21],
but the structural similarity is striking.



D. Fungi 

We also examined fungal mycelial networks extracted 
from time series of digitized images of colony growth. In 
these undirected, planar, weighted networks, the nodes 
represent hyphal tips, branch points, or anastomoses (hyphal
fusions), and the edges represent the interconnecting
hyphal cords weighted by their conductivity [27, 52, 53].
For comparison, we also digitized weighted networks of 
the acellular slime mold Physarum polycephalum [24] . 
Fungal networks look like trees but contain additional 
edges (known as cross-links) that generate cycles. 

As shown in Fig. 12(a), we find using our method
that replicate networks from different species at comparable
time points are grouped together. Furthermore,
the aggregate clustering pattern reflects increasing levels
of cross-linking that are characteristic of different
species, as illustrated in Fig. 12(b); this ranges from the
low levels in Resinicium bicolor to intermediate levels in
Phanerochaete velutina and highly cross-linked networks
formed by Phallus impudicus. By constructing a dendrogram
for only one species but including data from
repeated experiments and over time [see Fig. 12(c)], we
observe a progression from trees at early developmental
times to an increasingly cross-linked network later in
mycelium growth [26, 27]. In early growth, the developmental
stage appears to dominate the clustering pattern,
as networks from different replicates but of similar
age are grouped together. At later times, however, networks
show a high aggregate level of similarity, and the
fine-grained clustering predominantly reflects the subtle
changes in structure evolving within each replicate.



FIG. 12. (Color online) (a) Dendrogram of networks for six
different species of saprotrophic basidiomycetes and the slime
mold Physarum polycephalum. Each leaf represents a replicate
experiment. The colors and numbers correspond to the
species as follows: (1) Resinicium bicolor, (2) Physarum polycephalum,
(3) Phallus impudicus, (4) Phanerochaete velutina,
(5) Stropharia caerulea, and (6) Agrocybe gibberosa. (b) Images
illustrating the network structure of the different species
[52]. (c) Dendrogram of network development in six replicate
time series of Phanerochaete velutina. We color the leaves by
time, and the color bar underneath the leaves indicates experiment
number (1, ..., 6). In the inset, we show extracted
networks that illustrate the transition from simple branching
trees to increasing levels of interconnection (i.e., cross-linking)
with time.



E. New York Stock Exchange

As our final example, we consider a set of stock-return
correlation networks for the New York Stock Exchange
(NYSE), which is the largest stock exchange in the world
(as measured by the aggregate US dollar value of the
securities listed on it). Each node represents a stock,
and the strength of the edge connecting stocks $i$ and $j$ is
linear in the Pearson product-moment correlation coefficient
between the daily logarithmic returns of the stocks
[22]. We consider $N = 100$ stocks during the time period
1985-2008 and construct a network for each 6 months of
data. This yields a sequence of fully connected, weighted
adjacency matrices whose elements quantify the similarity
of two stocks (normalized to the unit interval for each
time window).
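
A minimal Python sketch of such a correlation network for one time window follows. The text states only that edge weights are linear in the Pearson correlation and normalized to the unit interval, so the specific map $A_{ij} = (\rho_{ij} + 1)/2$ used here is an assumption, as are the function name `correlation_network`, the zeroed diagonal, and the synthetic prices.

```python
import numpy as np


def correlation_network(prices):
    """Weighted network from a (days x stocks) array of positive daily prices."""
    prices = np.asarray(prices, dtype=float)
    log_returns = np.diff(np.log(prices), axis=0)   # daily logarithmic returns
    rho = np.corrcoef(log_returns, rowvar=False)    # Pearson correlations
    A = np.clip((rho + 1.0) / 2.0, 0.0, 1.0)        # assumed linear map to [0, 1]
    np.fill_diagonal(A, 0.0)                        # assumption: no self-edges
    return A


# Synthetic prices (not market data) for one window of 126 trading days.
rng = np.random.default_rng(0)
base = np.exp(np.cumsum(0.01 * rng.standard_normal(126)))
other = np.exp(np.cumsum(0.01 * rng.standard_normal(126)))
toy_prices = np.column_stack([base, 2.0 * base, 1.0 / base, other])
A_nyse = correlation_network(toy_prices)
```

In the synthetic example, the second stock is a rescaled copy of the first (correlation 1, weight close to 1), and the third is its reciprocal (correlation -1, weight close to 0). Applying the function to consecutive six-month windows would give a time-ordered sequence of networks.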

We show the dendrogram for the NYSE networks in
Fig. 13. The first division of these networks classifies
them into two groups (which we have colored in blue and
red). The red cluster appears to correspond to periods
of market turmoil, including the networks for the second
half of 1987 (including the Black Monday crash of
October 1987), all of 2000-2002 (including and following
the bursting of the dot-com bubble), and the second
half of 2007 and all of 2008 (including the recent credit
and liquidity crisis). The value of the NYSE composite
index, which measures the aggregate performance of all
common stocks listed on the NYSE [56], supports our hypothesis
that the red cluster is associated with periods of
market turmoil. Indeed, the networks in the red cluster
correspond (with one or two exceptions) to the periods
of high volatility of the composite index (see Fig. 13).




FIG. 13. (Color online) Dendrogram for 48 NYSE networks 
during the period 1985-2008 [22]. Observe the clear split of 
the dendrogram into two clusters (a blue group on the left 
and a red group on the right). Leaf color indicates mean 
daily volatility of the composite index. 



X. CONCLUSIONS 



We have developed an approach that facilitates the
comparison of diverse networks by summarizing network
community structure using what we call mesoscopic response
functions (MRFs). We have demonstrated how
this approach can be used to group networks both across
categories and within categories. Our work builds on
prior research on network community structure, which
has focused predominantly on algorithmic detection of
the communities rather than on subsequently using the
communities for applications (such as comparing sets of
networks).

The development of algorithmic methods to detect
communities is frequently motivated by the idea that the
community structure of a network representing a system
has some bearing on the function of the system. If different
networks perform different functions (and if their
functions are constrained, at least in part, by their mesoscopic
structure), then it should be possible in principle
to derive a functional classification of networks based on
community structure. Although this has mostly been
presented as a presumption in the existing literature, it
is actually an empirically testable hypothesis. Indeed,
we have shown in the present paper that one can systematically
exploit mesoscopic structure to obtain useful
comparisons of networks. This allows one to derive taxonomies
for networks that also appear to have correspondence
with functional similarities. We observed that networks
that were not grouped with other members of the
same class appeared to be unusual in some respects, and
we also demonstrated that we could detect historically
noted financial and political changes from time-ordered
sequences of networks.

We believe that our framework has the potential to
aid in the exploration and exploitation of similarities in
network structures across both network types and disciplinary
boundaries.



Appendix A: Robustness of Clustering 

To examine the robustness of our clustering to false
positives (false links) and false negatives (false non-links),
we consider two network rewiring mechanisms, and we
apply the rewiring to each network in a subset of 25 networks
highlighted in Table II of the Supplemental Material.
In the first mechanism, we randomly
rewire a number of edges corresponding to a given percentage
(5%, 10%, 20%, 50%, or 100%) of the total number
of edges in the network, subject to the constraints
that we preserve the network's degree distribution and
the fact that it consists of a single connected component
[57]. (That is, such a rewiring of a number of edges equal
to $x\%$ of the $L$ edges in a network means that we perform
$xL/100$ rewiring steps, rounded to an integer; the same edge
can be rewired multiple times.) In the second mechanism,
we randomly rewire a given number
of the edges subject only to the constraint that the
rewired network still consists of a single component.
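
The two rewiring mechanisms can be sketched as follows in plain Python. This is an illustrative sketch, not the procedure's original implementation: the adjacency-set representation, the function names, the accept-and-undo strategy for enforcing connectivity, the attempt cap, and the toy graph are all assumptions.

```python
import random


def is_connected(adj):
    """Check that every node is reachable from an arbitrary start node."""
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(adj)


def _apply(adj, remove, add):
    for u, v in remove:
        adj[u].discard(v)
        adj[v].discard(u)
    for u, v in add:
        adj[u].add(v)
        adj[v].add(u)


def rewire(adj, n_steps, preserve_degrees, rng):
    """Perform n_steps accepted rewirings in place; return the number performed."""
    nodes = sorted(adj)
    done, attempts = 0, 0
    while done < n_steps and attempts < 1000 * n_steps:
        attempts += 1
        edges = sorted((u, v) for u in adj for v in adj[u] if u < v)
        a, b = rng.choice(edges)
        if preserve_degrees:
            # Double-edge swap: (a,b),(c,d) -> (a,d),(c,b) keeps every degree.
            c, d = rng.choice(edges)
            if rng.random() < 0.5:
                c, d = d, c
            if len({a, b, c, d}) < 4 or d in adj[a] or b in adj[c]:
                continue
            old, new = [(a, b), (c, d)], [(a, d), (c, b)]
        else:
            # Move one edge to a uniformly chosen unconnected node pair.
            c, d = rng.sample(nodes, 2)
            if d in adj[c]:
                continue
            old, new = [(a, b)], [(c, d)]
        _apply(adj, old, new)
        if is_connected(adj):
            done += 1
        else:
            _apply(adj, new, old)  # undo a step that disconnects the network
    return done


def ring_with_chords(n=12):
    """Toy connected graph (hypothetical test case): a ring plus two chords."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        _apply(adj, [], [(i, (i + 1) % n)])
    _apply(adj, [], [(0, 6), (3, 9)])
    return adj


rng = random.Random(0)
g1, g2 = ring_with_chords(), ring_with_chords()
L = sum(len(vs) for vs in g1.values()) // 2
deg_before = {u: len(vs) for u, vs in g1.items()}
n_steps = round(0.5 * L)                       # rewire 50% of the L edges
done1 = rewire(g1, n_steps, preserve_degrees=True, rng=rng)
done2 = rewire(g2, n_steps, preserve_degrees=False, rng=rng)
```

Only the first mechanism keeps each node's degree fixed; both keep the number of edges and the single connected component.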

Because we are perturbing the original network, we focus
on the distance matrices $D^{\mathcal{H}}$, $D^{S}$, and $D^{\eta}$, as they can
be calculated directly for each network. We consider 25
of the 746 original networks of varying sizes and edge densities;
we highlight these networks in bold in Table II of
the Supplemental Material. In Fig. 14, we show the distance
matrices for this subset of networks when different
percentages of edges have been rewired with the degree
distribution preserved. The first column shows the matrices
for the original networks. (Note that the node orderings
for $D^{\mathcal{H}}$, $D^{S}$, and $D^{\eta}$ are not necessarily the same
in Fig. 14 because of the block-diagonalization of matrices.)
The subsequent columns show the mean distance
matrices as increasing numbers of edges are rewired; for a
given row, the node ordering in each column is fixed. The
distance matrices for the randomizations are the mean
pairwise distances between networks, where the mean is
calculated over all possible pairs between 10 perturbations
of each network. More precisely, let $A$ and $B$ represent
two different (unperturbed) networks and let the
sequences $A_1, A_2, \ldots, A_{10}$ and $B_1, B_2, \ldots, B_{10}$ represent
10 realizations of the perturbation process (e.g., at the
5% level) for the networks. To calculate the distance
between $A$ and $B$ under perturbation, we find for each
$j \in \{1, \ldots, 10\}$ the distances between $A_j$ and $B_1$, $A_j$ and
$B_2$, ..., and $A_j$ and $B_{10}$. We then calculate the mean
of the ensuing $10 \times 10 = 100$ distance values. Based on
visual inspection of Fig. 14, the matrices for the first few
columns for all of the distances are fairly similar to the
original distance matrices. This suggests some notion of
robustness in our clustering technique. We study only 25
networks because of the computational costs of rewiring
a large number of networks multiple times; however, we
have performed the same investigation for 5 different subsets
of 25 networks and obtained similar results. We list
the networks in each subset of 25 in Table I in the Supplemental
Material.
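
The averaging over perturbed realizations described above can be sketched as follows. The distance function here is a placeholder (the mean absolute difference between two curves sampled at the same points) and is not the MRF distance defined in the paper; the function names and toy curves are likewise assumptions.

```python
import numpy as np


def mean_perturbed_distance(summaries_a, summaries_b, distance):
    """Mean of distance(A_j, B_k) over all pairs of perturbed realizations."""
    values = [distance(a, b) for a in summaries_a for b in summaries_b]
    return float(np.mean(values))


def curve_distance(f, g):
    """Placeholder distance between two curves sampled at the same points."""
    return float(np.mean(np.abs(np.asarray(f) - np.asarray(g))))


# Toy summaries (not real MRFs): 10 realizations for each of two networks.
realizations_a = [np.full(5, float(j)) for j in range(10)]
realizations_b = [np.full(5, 10.0) for _ in range(10)]
d_ab = mean_perturbed_distance(realizations_a, realizations_b, curve_distance)
```

With 10 realizations per network, the mean is taken over $10 \times 10 = 100$ values; in the toy example the pairwise distances are $10 - j$, so the mean is 5.5.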

To carry out a more thorough randomization of each 
network, we now rewire every edge in the network 10 

we show the D^, D 5 , and 
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times on average. In Fig. 
D 7 ? mean-distance matrices for this number of rewirings. 
We again calculate the mean distance using the method 
described in the previous paragraph. The first column 
again shows the distance matrices for the original net- 
works. The second and third columns show the distance 
matrices for randomizations in which the degree distribu- 
tion is preserved and destroyed, respectively. The node 
orderings of the matrices in the second and third columns 
are again the same as the orderings for the matrix of 
the first column of the corresponding row. The second 



column in Fig. 15 demonstrates that some block struc- 
ture remains in the distance matrices when the degree 
distribution is preserved. The third column shows that 
much of this structure is destroyed (though some block 
structure is still visible) when the degree distribution is 
not preserved. When the networks are "fully random- 
ized" in this way — with the only constraint being that 
each rewired network must consist of a single connected 
component — one is in effect producing random graphs. 
These random graphs might, however, still have some
common properties, such as the number of nodes and
the edge density.

FIG. 14. (Color online) Block-diagonalized mean distance
matrices $D^H$ (top row), $D^S$ (middle row), and $D^\eta$ (bottom
row) for the 25 networks listed in bold in Table II of the
Supplemental Material. The columns show the mean-distance
matrices following randomizations of the original network in
which a given percentage of edges are rewired and the degree
distributions of the networks are preserved. (We also constrain
each rewired network to consist of a single connected
component.) The ordering of the nodes in the plots is fixed
for each row. The first column shows the distance matrix for
the original networks. The distance matrices for the randomizations
are the mean pairwise distances between networks.

Appendix B: Computational Heuristics 
1. Robustness of Network MRFs 

We detected all communities in the main text using 
the locally greedy Louvain algorithm [25]; however, sev- 
eral alternative heuristics exist, so we now investigate 
whether the choice of heuristic has any effect on the results.
In Ref. [24], Good et al. demonstrated that there
can be extreme near-degeneracies in the energy function,
in particular an exponential number of low-energy (i.e.,
high-modularity) solutions. Given this, it is unsurprising
that different energy-optimization heuristics can yield
very different partitions for the same network. Good et 
al. suggested that the reason for this behavior is that 
different heuristics sample different regions of the energy 
landscape. Because of the potential sensitivity of results 
to the choice of heuristic, one should treat individual par- 
titions by particular heuristics with caution. However, 
one can have more confidence in the validity of the parti- 
tions if different heuristics produce similar results. Here 
we compare the results for the Louvain algorithm [25] 
with those for a spectral algorithm [18] and simulated 
annealing [58] . 

FIG. 15. (Color online) Block-diagonalized distance matrices
$D^H$ (top row), $D^S$ (middle row), and $D^\eta$ (bottom row)
for the 25 networks listed in bold in Table II of the Supplemental
Material. The first column shows the distance matrices
for the original networks. The second column shows the
mean distance matrices following randomizations of the original
networks in which 10 times the total number of edges in
the networks have been rewired such that the degree distributions
are preserved and the rewired networks each consist
of a single connected component. The third column shows
the mean distance matrices following randomizations of the
original networks in which 10 times the total number of edges
in the networks have been rewired but only the fact that the
networks consist of single connected components is preserved
(i.e., the degree distributions are not preserved). The distance
matrices for the randomizations are composed of the
mean pairwise distances between the networks.

In Fig. 16, we show MRFs for three networks calculated
using the Louvain [25], spectral [18], and simulated annealing [58]
algorithms. For all three networks, the three algorithms
agree very closely on the shapes of the $H$, $S$, and
$\eta$ MRFs. The MRFs are most similar for the roll-call voting
network of the 102nd US Senate [MSI], and the $H$
MRF is almost identical for the three heuristics. In general,
we observe the largest differences in the shapes of
the MRFs when using the spectral algorithm. The spec- 
tral algorithm that we used begins by finding a partition 
of the network into exactly two components such that the 
energy is minimized (among all bipartitions). It then re- 
cursively partitions the smaller networks into two groups 
until no decrease in energy can be obtained through bi- 
partitioning. At each step, this algorithm only finds the 
optimal partition of each community into two smaller 
communities even though a split into more communities 
could yield a lower energy. Given this, it is unsurpris- 
ing that the spectral algorithm often identifies partitions 
further from the optimum than the other heuristics. For 
the remainder of this section, we therefore only compare 
the Louvain and simulated annealing algorithms.

FIG. 16. (Color online) Comparison of the MRFs produced
using spectral [18], Louvain [25], and simulated annealing
[58] optimization heuristics. We show the MRFs for (a) the
Zachary Karate Club network [83], (b) the roll-call voting
network of the 102nd US Senate [M52], and (c) the Garfield
small-world citations network [40].



2. Robustness of Resulting Network Taxonomies 



Although Fig. [16] shows good agreement between the 
shapes of the MRFs that we obtain from the different 
computational heuristics, we nevertheless check that the 
small differences that do occur do not have a significant 
effect on the resulting network taxonomy. Because of the 
computational cost of detecting communities using sim- 
ulated annealing, we investigate the effect on the taxon- 
omy using a subset of small networks. We highlight all of 
the networks that we consider with an asterisk (*) in Table II
of the Supplemental Material. (The largest network
that we include is the cat brain cortical/thalmic network
[TT], which has 95 nodes and 1,170 edges.) Indeed, MRFs for small
networks tend to be much noisier than those for large
networks (see, for example, Fig. 16(a), which shows the
MRFs for the 34-node Zachary Karate Club network),
so we expect that any differences between algorithms are
likely to be more pronounced for small networks.

FIG. 17. (Color online) Comparison of the dendrograms produced
using the Louvain algorithm (top panel) and simulated
annealing (bottom panel) for a subset of 15 networks. The
only differences between the two dendrograms are the order in
which the "Communication within a sawmill on strike" and
the "BA: (100,2)" networks cluster and the distances at which
the other networks cluster.

In Fig. 17, we show dendrograms obtained using the
Louvain and simulated-annealing modularity optimization
algorithms for a subset of 15 networks. On visual
inspection, the dendrograms appear to be very similar,
as there are only a few small differences in the heights
at which leaves and clusters combine. To quantify the
similarity between a pair of dendrograms with underlying
distance matrices $s$ and $t$, we define a correlation
coefficient $\phi$ as

$$\phi = \frac{\sum_{i<j} (s_{ij} - \bar{s})(t_{ij} - \bar{t})}{\sqrt{\left[\sum_{i<j} (s_{ij} - \bar{s})^2\right]\left[\sum_{i<j} (t_{ij} - \bar{t})^2\right]}} , \qquad \text{(B1)}$$

where $\bar{s}$ is the mean of the distances $s_{ij}$ and $\bar{t}$ is the mean
of the distances $t_{ij}$. Dendrograms derived from identical
distance matrices have correlation coefficient $\phi = 1$.
The correlation for the example dendrograms shown in
Fig. 17 is 0.997. To get a better sense of the extent of this
correlation, we compare the observed correlations with 
those obtained for randomized dendrograms. To make 
the comparison, we first produce a distribution of correlation
coefficients $\phi$ between a large number of empirical
(unrandomized) dendrograms produced by the Louvain
and simulated-annealing algorithms. Because of the computational
costs of calculating the MRFs for the simulated
annealing algorithm, we only consider the subset of
25 networks identified above. We select 15 networks uniformly
at random from this subset of 25 networks and
generate two dendrograms similar to those in Fig. 17:
one corresponds to the distance matrix produced by the
Louvain algorithm and the other corresponds to the distance
matrix produced by simulated annealing. We then
calculate the correlation coefficient between the two distance
matrices. We repeat this process 10,000 times to
obtain 10,000 correlation coefficients, whose distribution
we show using the hollow red histogram in Fig. 18. This
procedure makes it possible to compare a large number
of dendrograms at the computational cost of calculating
simulated annealing MRFs for a total of 25 networks,
highlighted with asterisks in Table II of the Supplemental
Material.
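
A correlation coefficient of this kind can be sketched in a few lines. The sketch below assumes that the coefficient is the standard Pearson correlation taken over the distinct pairs $i < j$ of two symmetric distance matrices, which is consistent with the means $\bar{s}$ and $\bar{t}$ described in the text and with the value 1 for identical matrices; the function name `dendrogram_correlation` is hypothetical.

```python
import math


def dendrogram_correlation(s, t):
    """Pearson correlation between the upper-triangular entries of s and t.

    s and t are square, symmetric distance matrices (lists of lists) defined
    on the same set of networks in the same order.
    """
    n = len(s)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    s_vals = [s[i][j] for i, j in pairs]
    t_vals = [t[i][j] for i, j in pairs]
    s_bar = sum(s_vals) / len(s_vals)
    t_bar = sum(t_vals) / len(t_vals)
    numerator = sum((a - s_bar) * (b - t_bar) for a, b in zip(s_vals, t_vals))
    denominator = math.sqrt(
        sum((a - s_bar) ** 2 for a in s_vals) * sum((b - t_bar) ** 2 for b in t_vals)
    )
    return numerator / denominator


# Toy 3 x 3 matrices: identical matrices versus a reversed ordering of distances.
s_toy = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
t_toy = [[0, 3, 2], [3, 0, 1], [2, 1, 0]]
```

For the toy matrices, comparing `s_toy` with itself gives a coefficient of 1, whereas comparing it with `t_toy` (in which the ranking of the distances is reversed) gives -1.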

We then compare this observed distribution of correla- 
tion coefficients to a randomized reference. We focus on 
the correlation between empirical Louvain dendrograms 
(i.e., empirical dendrograms resulting from distance ma- 
trices produced by the Louvain method) and random- 
ized simulated-annealing dendrograms (i.e., dendrograms 
resulting from distance matrices produced by the simu- 
lated annealing algorithm that have been subsequently 
randomized). We proceed as follows: for each of the 
10,000 dendrogram pairs that we assembled from sub- 
sets of 15 networks, we create 100 randomizations of the 
simulated-annealing dendrogram, and we then calculate 
the correlation coefficient between each of these randomized
dendrograms and the corresponding empirical Louvain
dendrogram. The resulting distribution from 10,000
repetitions is the solid blue histogram in Fig. 18. To randomize
the simulated-annealing dendrogram, we used the
double-permutation procedure described in Refs. [60, 61].
This procedure has two steps. First, we randomize the 
distances at which the different clusters are combined. 
For example, consider an unrandomized dendrogram in 
which clusters A and B are combined at a distance of 0.45 
and clusters C and D are combined at a distance of 0.65; 
after the randomization, A and B might be combined at 
a distance of 0.65 and C and D might be combined at 
a distance of 0.45. Second, we randomize the networks 
corresponding to each leaf in the dendrogram. This two- 
step randomization procedure maintains the underlying 
distances and the topology of the dendrogram. 

As mentioned above, we show the distributions of correlation
coefficients between empirical Louvain dendrograms
and the empirical (unrandomized) and randomized
simulated-annealing dendrograms in Fig. 18. The
correlation is clearly much higher for the empirical case,
as there is only a very slight overlap in the tails of the two
distributions. The correlation between the Louvain and
simulated-annealing dendrograms is greater than 0.99 for
about 63% of the studied dendrograms.



Appendix C: Diagnostic for Assessing the Clustering 
from Different Distance Measures 



FIG. 18. (Color online) Comparison of the distributions of
correlation coefficients between empirical Louvain dendrograms
and empirical (red, hollow) and randomized (blue,
solid) simulated-annealing dendrograms. See the text for details.

An examination of the leaf colors of the dendrogram
in Fig. 7 illustrates that the employed distance measure
groups together networks from a variety of categories,
including political voting networks, political committee 
networks, Facebook networks, metabolic networks, and 
fungal networks. A visual comparison provides a rea- 
sonable starting point for assessing the effectiveness of 
different distance measures at clustering networks. To 
quantify how effectively each distance matrix ($D^H$, $D^S$,
$D^\eta$, and $D^p$) clusters networks of the same type, we introduce
a clustering diagnostic, which we denote by a(h)
and explain below. Because the assignment of networks
to categories is subjective and because some of
the categories include networks of very different types, it 
would be inappropriate to assess the effectiveness of a dis- 
tance measure based on how well it clusters networks in 
very broad categories. We thus focus our examination on 
narrower categories whose constituent networks are clustered
fairly tightly in Fig. 7. We consider the following
8 categories of networks: Facebook, metabolic, political
cosponsorship, political committee, political voting,
financial, brain, and fungal.

The clustering diagnostic depends on where one "cuts" 
the dendrograms. We start by constructing a dendrogram
for each of the four distance matrices $D^H$, $D^S$,
$D^\eta$, and $D^p$. Performing a horizontal cut through a dendrogram
at a given height h splits the dendrogram into
multiple disconnected clusters (h is measured in terms of
ultrametric distances; see Fig. 17). For each such cluster,
we calculate the proportion of networks from a particular 
category that are contained in it. For example, if a cut 
produces three clusters and if we consider the Facebook 
category, then we might find that one cluster contains 
two tenths of the Facebook networks, a second cluster has 
three tenths of those networks, and the third cluster has 
the remaining half of those networks. We calculate these 
membership fractions for each network category and for 
each cluster. We then identify, for each category, what we
call the plurality cluster, which is defined as the cluster
that includes the largest fraction of networks from that
category. In the above example, the third cluster is the
plurality cluster for the Facebook category. Our diagnostic
a(h) is then defined by adding across all 8 categories
the fraction of networks in the plurality clusters:

$$a(h) = \sum_{j=1}^{8} p_j(h) , \qquad \text{(C1)}$$

where $p_j(h)$ is the plurality fraction for the $j$th category
of networks for the given cut at height h of the dendrogram.

We perform similar calculations for each level of the 
dendrogram and use the resulting values of a(h) to as- 
sess the effectiveness of the different distance measures at 
clustering the networks. For example, at the root of the 
dendrogram, all of the networks are in a single cluster, 
so the maximum fraction of networks in the same cluster 
is 1 for every network category. Given the above choice 
of 8 categories, this yields a = 8. However, as one con- 
siders lower levels of the dendrogram, the clusters break 
up more and more, so the fraction of networks in the 
plurality cluster in each category typically decreases. Effective
distance measures ought to result in relatively
high values for a(h).
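
The diagnostic for a single cut can be sketched as follows. This is an illustration only: it assumes that the cut at height h has already been turned into a cluster label for each network, and the names `clustering_diagnostic`, `cluster_of`, and `category_of` are hypothetical.

```python
from collections import Counter, defaultdict


def clustering_diagnostic(cluster_of, category_of):
    """Sum, over categories, of the fraction of networks in the plurality cluster.

    cluster_of maps each network to its cluster label for one dendrogram cut;
    category_of maps each network to its category.
    """
    totals = Counter(category_of.values())
    counts = defaultdict(Counter)
    for network, category in category_of.items():
        counts[category][cluster_of[network]] += 1
    return sum(max(counts[c].values()) / totals[c] for c in totals)


# Toy cut with three clusters: 10 Facebook networks split 2/3/5 (as in the
# example in the text) and 4 fungal networks that all share one cluster.
category_toy = {f"fb{i}": "Facebook" for i in range(10)}
category_toy.update({f"fungal{i}": "fungal" for i in range(4)})
cluster_toy = {f"fb{i}": (0 if i < 2 else 1 if i < 5 else 2) for i in range(10)}
cluster_toy.update({f"fungal{i}": 1 for i in range(4)})
a_toy = clustering_diagnostic(cluster_toy, category_toy)
```

In the toy cut, the plurality fraction is 0.5 for the Facebook category and 1 for the fungal category, so the diagnostic equals 1.5 out of a maximum of 2.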



In Fig. 19, we compare the values of a(h) at each level
of the dendrogram for $D^H$, $D^S$, $D^\eta$, and $D^p$. For each of
the different subsets of networks and for most of the dendrogram
levels, the PCA-distance $D^p$ is the most effective
of the employed distance measures at clustering networks 
of the same category. This agrees with our visual assess- 
ment (i.e., our identification of contiguous blocks of color) 
of the different measures. 



Appendix D: Using Simple Characteristics to 
Cluster Networks 



We established in Section VIII that the PCA-distances
$D^p$ between MRFs can produce sensible network taxonomies,
and we now consider briefly whether the observed
taxonomies can be explained using simple summary
statistics. We consider only a few specific properties,
though of course there are myriad other network
diagnostics that one might consider.

Perhaps the three simplest properties of an undirected
network are the following: (1) whether it has weighted
or unweighted edges; (2) the number of nodes $N$; and (3)
the edge density $d = 2L/[N(N-1)]$ (where $L$ is the number
of edges, which we distinguish from the total edge
weight $m$ in weighted networks). The top colored row in
Fig. 20 indicates that many of the weighted networks are
clustered together at the far left of the dendrogram. How- 
ever, there are also weighted networks scattered through- 
out the dendrogram, so whether a network is weighted or 
unweighted does not explain the observed classification. 
The third colored row provides a clearer explanation for 
the cluster of networks at the left: These are not sim- 
ply weighted networks, as they are in fact similarity net- 
works, so that nearly all possible edges are present and 
have weights indicating connection strengths. However,
this property alone cannot explain the observed classification,
as several of the weighted networks containing
nearly all possible edges do not appear at the far left
of the dendrogram. In fact, there are many clusters in
the dendrogram that contain networks with very different
fractions of possible edges. The total number of nodes,
shown by the second colored row in the figure, again explains
some of the clustering, as networks with similar
numbers of nodes are clustered together in some regions
of the dendrogram. However, there are also numerous
examples in which networks with the same number of
nodes appear in different clusters. Therefore, none of
these three simple network diagnostics can explain the
observed classification by itself.

FIG. 19. (Color online) Comparison of the effectiveness of
the employed distance measures at clustering networks of the
same category. As discussed in the text, we quantify this
using the clustering diagnostic a(h). We calculate dendrograms
from four distance matrices ($D^H$, $D^S$, $D^\eta$, and $D^p$)
and compare the resulting values of a(h) for different sets of
categories. (a) The value of the clustering diagnostic a(h)
as a function of dendrogram cut level h (i.e., where the dendrogram
is split into clusters) for the following 8 categories of
networks: Facebook, metabolic, political cosponsorship, political
committee, political voting, financial, brain, and fungal.
(b) The value of a(h) for the largest 5 of the above 8 categories
(Facebook, metabolic, political cosponsorship, political
committee, and political voting) and (c) for the smallest 5 of
the above 8 categories (Facebook, metabolic, financial, brain,
and fungal). The maximum possible value of a(h) in each
panel is equal to the number of categories considered in that
panel. The values of a(h) obtained using the PCA-distance
matrix $D^p$ (gray solid curve) are usually higher than those
obtained using the other three distance measures. This suggests
that PCA distance is the most effective of the four employed
clustering measures.

FIG. 20. (Color online) Taxonomy for 189 networks. We
constructed the dendrogram using the distance matrix $D^p$
and average linkage clustering. We order the leaves of the
dendrogram to minimize the distance between adjacent nodes,
and we color the leaves to indicate the type of network. The
three color bars below the dendrogram indicate whether the
network corresponding to each leaf is weighted or unweighted
(top), the number of nodes in the networks $N$ (middle), and
the fraction of possible edges that are present $d$ (bottom).

SUPPLEMENTAL MATERIAL

Taxonomy Of All Studied Networks 

In Supplemental Fig. 1, we show a dendrogram containing
leaves for all of the 746 networks that we studied.
This dendrogram contains several large contiguous
blocks of leaves that correspond to networks belonging to 
the same category. For example, there are large contigu- 
ous blocks of fungal, Facebook, metabolic, political com- 
mittee, political voting, and financial networks. These 
blocks do not always include all of the networks within a 
category; when there are separate contiguous blocks for 
the same category, the blocks sometimes correspond to 
different types of networks within a category. For ex- 
ample, the political voting networks category includes 
separate blocks of UN voting networks and UK House 
of Commons voting networks. However, because of the 
number of networks that we include in the study and the 
imbalance in the spread of networks across categories, 
Supplemental Fig. 1 is difficult to interpret and the smaller categories
are obfuscated by the larger ones. Therefore, for a clearer 
view of the relationships between the different categories 
of networks, we focus in the main text on 189 of the 746 
networks. 



Details of Networks 

In Supplemental Table II, we provide details of all of 
the networks that we employed in our study. The set of 
networks includes several synthetic network models
as well as synthetic benchmark networks that were introduced
to test community detection algorithms. We include
multiple realizations (using various parameter values)
for many of the model and benchmark networks. In
this section, we briefly describe these synthetic networks 
and explain the notation that we use to label them in 
Supplemental Table II. 



1. Erdős-Rényi (ER)

In an ER network of $N$ nodes, each pair of nodes is connected
by an unweighted edge with probability $p$ (and is
not connected with probability $1 - p$) [1]. The degree of
each node is distributed according to a binomial distribution.
We label the ER networks using the notation "ER:



2. Watts-Strogatz (WS) 

We consider the small-world network of Watts and 
Strogatz [2] for a one-dimensional lattice of N nodes with 
periodic boundary conditions. The network consists of a 
ring in which each node is connected with an unweighted
edge to all of its neighbors that are $k$ or fewer lattice
spacings away. Each edge is then considered in turn and
one end is rewired with probability $p$ to a different node
selected uniformly at random, subject to the constraint
that there can be no self-edges or multi-edges. We label
each Watts-Strogatz network as "WS: $(N, k, p)$".
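
The construction just described can be sketched in plain Python. This is a minimal sketch, not the code used for the networks in the table: the function name `watts_strogatz` and the `seed` argument are hypothetical, and the sketch assumes $N > 2k$ so that the ring lattice has no duplicate edges.

```python
import random


def watts_strogatz(n, k, p, seed=None):
    """Return the edge set of a Watts-Strogatz network (assumes n > 2 * k)."""
    rng = random.Random(seed)
    # Ring lattice: each node is linked to all neighbors at most k spacings away.
    lattice = [(i, (i + step) % n) for i in range(n) for step in range(1, k + 1)]
    edges = {frozenset(edge) for edge in lattice}
    # Consider each lattice edge in turn and rewire its far end with probability p.
    for i, j in lattice:
        if rng.random() >= p:
            continue
        # Allowed new endpoints create neither a self-edge nor a multi-edge.
        candidates = [v for v in range(n) if v != i and frozenset((i, v)) not in edges]
        if candidates:
            edges.remove(frozenset((i, j)))
            edges.add(frozenset((i, rng.choice(candidates))))
    return edges


ring = watts_strogatz(20, 2, 0.0, seed=1)      # no rewiring: pure ring lattice
rewired = watts_strogatz(20, 2, 1.0, seed=1)   # every edge rewired once
```

Because each rewiring removes one edge and adds one edge, the number of edges stays at $Nk$ for every value of $p$.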



3. Barabási-Albert (BA)

BA networks [3] are obtained using a network growth
mechanism in which nodes with degree $m$ are added to
the network, one per time step, and the other end of each
new edge attaches to an existing node with a probability
proportional to the degree of that node. We label each
BA network as "BA: $(N, m)$".
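
A growth process of this type can be sketched as follows. The text does not specify the initial condition, so the sketch assumes a fully connected seed of $m + 1$ nodes; that choice, the function name `barabasi_albert`, and the `seed` argument are assumptions made for illustration.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Return the edge list of a preferential-attachment network with n nodes."""
    rng = random.Random(seed)
    # Assumed seed graph: a clique on m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)]
    # Each node appears in `stubs` once per unit of degree, so a uniform draw
    # from `stubs` selects a node with probability proportional to its degree.
    stubs = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:  # m distinct targets, so no multi-edges
            targets.add(rng.choice(stubs))
        for target in targets:
            edges.append((target, new))
            stubs.extend((target, new))
    return edges


ba_edges = barabasi_albert(100, 2, seed=0)
```

With this seed graph, a network labeled "BA: (100,2)" has $3 + 98 \times 2 - 2 = 197$ edges: 3 from the initial triangle and 2 for each of the 97 nodes added afterwards.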



4. Fractal 

We generate fractal networks using the method described
in Ref. [4]. We begin by generating an isolated
group of $2^\beta$ fully connected nodes, where $\beta$ gives the size
of the clusters. These groups correspond to the hierarchical
level $h = 0$. We then create a second identical group
and connect the two groups using an edge density of $f_e^{-1}$,
where $f_e$ is the number of edges out of all possible edges
between the groups. We then duplicate this network and
connect the two duplicates at the level $h = 2$ using an
edge density of $f_e^{-2}$. We repeat this until we reach the
desired network size $N = 2^n$, where $n$ is the number of
hierarchical levels. At each step, the connection density is
decreased, resulting in progressively sparser interconnectivity
at higher hierarchical levels. The resulting network
exhibits self-similar properties. We label each network
"Fractal: $(n, \beta, f_e)$".



5. Random Fully-Connected 

We produce randomly weighted, fully connected net- 
works of N nodes by connecting every node to every 
other node with an edge whose weight is chosen uni- 
formly at random on the unit interval. The networks 
have $N(N-1)/2$ edges. We label each network "Random
fully-connected: $(N)$".
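
As a small illustration of this construction, the sketch below draws one weight per node pair. The function name `random_fully_connected` and the `seed` argument are hypothetical; note that Python's `random.random()` samples from the half-open interval [0, 1).

```python
import random


def random_fully_connected(n, seed=None):
    """Map each node pair (i, j) with i < j to a uniformly random edge weight."""
    rng = random.Random(seed)
    return {(i, j): rng.random() for i in range(n) for j in range(i + 1, n)}


weights = random_fully_connected(5, seed=0)
```

For $N = 5$, the dictionary has $5 \times 4 / 2 = 10$ entries, one for each possible edge.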



6. Kumpula-Onnela-Saramäki-Kaski-Kertész
(KOSKK) model

We generate weighted networks containing communities
using the model described in Ref. [5]. We create
edges via two mechanisms. First, at every time step, each
node $i$ selects a neighbor $j$ with probability $w_{ij}/s_i$, where
$w_{ij}$ is the weight of the edge connecting $i$ and $j$ and
$s_i = \sum_j w_{ij}$ is the strength of $i$. If $j$ has other neighbors
in addition to $i$, then one of them, $k$, is selected with probability
$w_{jk}/(s_j - w_{ij})$. If $i$ and $k$ are not connected, then
a new edge of weight $w_{ik} = w_0$ is created with probability
$p_n$. If the edge already exists, its weight is increased
by an amount $\delta$. In both cases, $w_{ij}$ and $w_{jk}$ are also
increased by $\delta$. This process is termed local attachment.
Second, if a node has no edges, then with probability $p_r$
it creates an edge of weight $w_0$ to a randomly selected
node. (This is called global attachment.) A node can be
deleted with probability $p_d$; if this happens, then all of
its edges are also removed and the node is replaced by a
new node, so that the total number of nodes $N$ and the
mean number of edges both remain constant. We label
each network "Weighted: $(N, w_0, \delta, p_n, p_r, p_d, t)$", where
$t$ is the total number of simulation time steps.

SM FIG. 1. (Color online) Dendrogram for 746 networks obtained using mesoscopic response functions (MRFs). Note that the
colors used to indicate network categories are not the same as those in the main text.



7. Lancichinetti-Fortunato-Radicchi (LFR) 
benchmark 

The LFR benchmark [6] consists of unweighted networks
with non-overlapping communities. A network in
this ensemble is constructed by assigning each node a
degree from a power-law distribution with exponent $\gamma$,
where the extremes of the distribution $k_{\min}$ and $k_{\max}$ are
chosen so that the mean degree is $\langle k \rangle$, and the nodes are
connected using the configuration model [7]. Each node
shares a fraction $\mu$ of its edges with nodes in other communities
and a fraction $1 - \mu$ of them with nodes in its
own community. The community sizes are taken from a
power-law distribution with exponent $\beta$, subject to the
constraint that the sum of all of the community sizes
equals the number of nodes $N$ in the network. The minimum
and maximum community sizes ($q_{\min}$ and $q_{\max}$)
are then chosen to satisfy the additional constraint that
$q_{\min} > k_{\min}$ and $q_{\max} > k_{\max}$, which ensures that each
node is included in at least one community. We label
each network "LFR: $(N, \langle k \rangle, k_{\max}, \gamma, \beta, \mu, q_{\min}, q_{\max})$".



8. Lancichinetti-Fortunato (LF) benchmark 

The LF benchmark [8] allows networks to be weighted
and the communities to overlap. In the present paper,
we only consider weighted networks with non-overlapping
communities. The node degrees are again taken from a
power-law degree distribution (as in LFR benchmark networks),
but this time we label the exponent $\tau_1$, and the
community sizes are taken from a power-law distribution
with exponent $\tau_2$. The strength $s_i$ of each node
is chosen so that $s_i = k_i^\beta$, where $k_i$ again gives the degree
of node $i$. There are also two mixing parameters:
a topological mixing parameter $\mu_t$, which specifies the
proportion of edges outside a node's community; and a
participation mixing parameter $\mu_w$, which specifies the
weight of a node's edges outside its community. We label
each network "LF: $(N, \langle k \rangle, k_{\max}, \mu_w, \beta, \tau_1, \tau_2)$". For
all of the LF networks, we set $N = 1000$. One can alternatively
set the minimum and maximum community sizes
$q_{\min}$ and $q_{\max}$. We always use $q_{\min} = 20$ and $q_{\max} = 50$,
so we do not include these parameters when we label the
networks.

9. LF-Newman-Girvan benchmark

We include an LF network ensemble with parameter
values $N = 128$, $\langle k \rangle = 16$, $k_{\max} = 16$, $\mu_w = 0.1$, $q_{\min} = 32$,
$q_{\max} = 32$, and $\beta = 1$. This family of networks is
similar to the NG benchmark [8, 9].

SM TABLE I. The networks included in the 6 subsets of 25 
networks used to test the robustness of the clusterings to ran- 
dom perturbations in Appendix A of the main text. The 
network ID corresponds to the numerical identifier of the net- 
work in Supplemental Table II. 



Subset | Network IDs
1 | 9, 11, 243, 251, 267, 269, 280, 283, 301, 305, 340, 351, 353, 646, 662, 665, 674, 688, 690, 693, 700, 711, 735, 736, 740
2 | 71, 250, 251, 252, 264, 267, 270, 271, 276, 285, 301, 303, 305, 347, 354, 646, 649, 669, 674, 683, 688, 691, 737, 738, 739
3 | 11, 12, 243, 254, 264, 266, 268, 273, 284, 290, 291, 302, 308, 340, 341, 354, 355, 645, 649, 662, 672, 690, 695, 697, 717
4 | 252, 253, 258, 261, 264, 268, 274, 279, 280, 286, 288, 291, 342, 347, 348, 352, 641, 669, 694, 696, 699, 700, 711, 714, 737
5 | 10, 20, 34, 248, 250, 261, 266, 268, 272, 283, 285, 301, 306, 308, 344, 656, 661, 666, 693, 694, 700, 709, 711, 712, 715
6 | 34, 249, 252, 253, 256, 257, 266, 277, 283, 289, 291, 301, 302, 340, 341, 345, 650, 656, 661, 662, 690, 709, 713, 717, 736






SM TABLE II. Network summary statistics. We symmetrize all networks, remove self-edges, and only consider largest connected 
components. In this table, we give the network category, whether it is weighted or unweighted, the number of nodes N in 
the largest connected component, the number of edges $L$ in this component, the fraction of possible edges present $f_e = 2L/[N(N-1)]$,
and a reference providing details of the data source. We highlight the 25 networks used in the randomizations
in Appendix A in bold and the 189 networks used in the aggregate taxonomy in red. We indicate with an asterisk (*) all 
networks used in Appendix B.2 to test the robustness of the taxonomy to different optimization heuristics. 
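
As a quick consistency check on the tabulated values, the fraction of possible edges can be recomputed from $N$ and $L$. The helper name `edge_density` is hypothetical; the numbers are taken from the first rows of the table.

```python
def edge_density(n_nodes, n_edges):
    """Fraction of possible edges present in an undirected network: 2L / [N(N - 1)]."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


# Human brain cortex, participant A1: N = 994, L = 13,520.
density_brain = round(edge_density(994, 13520), 4)
# Cat brain, cortical: N = 52, L = 515.
density_cat = round(edge_density(52, 515), 4)
```

These evaluate to 0.0274 and 0.3884, which agree with the tabulated entries for those two networks.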



ID 


Name 


Category 


Weighted 


N 


L 


fe 


References 


1 


Human brain cortex: participant Al 


Brain 


Y 


994 


13,520 


0.0274 


ma 


2 


Human brain cortex: participant A2 


Brain 


Y 


987 


14,865 


0.0305 


ma 


3 


Human brain cortex: participant B 


Brain 


Y 


980 


14,222 


0.0296 


m 


4 


Human brain cortex: participant D 


Brain 


Y 


996 


14,851 


0.0300 


na 


5 


Human brain cortex: participant E 


Brain 


Y 


992 


14,372 


0.0292 


ma 


6 


Human brain cortex: participant C 


Brain 


Y 


996 


14,933 


0.0301 




7 


Cat brain: 


cortical* 


Brain 


Y 


52 


515 


0.3884 


eh 


8 


Cat brain: cortical/thalmic* 


Brain 


Y 


95 


1,170 


0.2620 


EH 


9 


Macaque brain: cortical* 


Brain 


N 


47 


313 


0.2895 


m 


10 


Macaque brain: visual/sensory cortex* 


Brain 


N 


71 


438 


0.1763 


m 


11 


Macaque brain: visual cortex 1* 


Brain 


N 


30 


190 


0.4368 


na 


12 


Macaque brain: visual cortex 2* 


Brain 


N 


32 


194 


0.3911 


m 


13 


Coauthorship 


astrophysics 


Collaboration 


Y 


14,845 


119,652 


0.0011 


M 


14 


Coauthorship 


comp. geometry 


Collaboration 


Y 


3,621 


9,461 


0.0014 


Hum] 


15 


Coauthorship 


condensed matter 


Collaboration 


Y 


13,861 


44,619 


0.0005 


nn 


16 


Coauthorship 


Erdos 


Collaboration 


N 


6,927 


11,850 


0.0005 


na 


17 


Coauthorship 


high-energy theory 


Collaboration 


Y 


5,835 


13,815 


0.0008 




18 


Coauthorship 


network science 


Collaboration 


Y 


379 


914 


0.0128 


na 


19 


Hollywood film music* 


Collaboration 


Y 


39 


219 


0.2955 


m 


20 


Jazz collaboration 


Collaboration 


N 


198 


2,742 


0.1406 


m 


21 | Facebook: American | Facebook | N | 6,370 | 217,654 | 0.0107 | EH
22 | Facebook: Amherst | Facebook | N | 2,235 | 90,954 | 0.0364 | eh
23 | Facebook: Auburn | Facebook | N | 18,448 | 973,918 | 0.0057 | EH
24 | Facebook: Baylor | Facebook | N | 12,799 | 679,815 | 0.0083 | EH
25 | Facebook: BC | Facebook | N | 11,498 | 486,961 | 0.0074 | eh
26 | Facebook: Berkeley | Facebook | N | 22,900 | 852,419 | 0.0033 | EH
27 | Facebook: Bingham | Facebook | N | 10,001 | 362,892 | 0.0073 | EH
28 | Facebook: Bowdoin | Facebook | N | 2,250 | 84,386 | 0.0334 | EH
29 | Facebook: Brandeis | Facebook | N | 3,887 | 137,561 | 0.0182 | EH
30 | Facebook: Brown | Facebook | N | 8,586 | 384,519 | 0.0104 | EH
31 | Facebook: BU | Facebook | N | 19,666 | 637,509 | 0.0033 | EH
32 | Facebook: Bucknell | Facebook | N | 3,824 | 158,863 | 0.0217 | EH
33 | Facebook: Cal | Facebook | N | 11,243 | 351,356 | 0.0056 | EH
34 | Facebook: Caltech | Facebook | N | 762 | 16,651 | 0.0574 | EH
35 | Facebook: Carnegie | Facebook | N | 6,621 | 249,959 | 0.0114 | EH
36 | Facebook: Colgate | Facebook | N | 3,482 | 155,043 | 0.0256 | EH
37 | Facebook: Columbia | Facebook | N | 11,706 | 444,295 | 0.0065 | EH
38 | Facebook: Cornell | Facebook | N | 18,621 | 790,753 | 0.0046 | EH
39 | Facebook: Dartmouth | Facebook | N | 7,677 | 304,065 | 0.0103 | EH
40 | Facebook: Duke | Facebook | N | 9,885 | 506,437 | 0.0104 | EH
41 | Facebook: Emory | Facebook | N | 7,449 | 330,008 | 0.0119 | EH
42 | Facebook: FSU | Facebook | N | 27,731 | 1,034,799 | 0.0027 | EH
43 | Facebook: Georgetown | Facebook | N | 9,388 | 425,619 | 0.0097 | EH
44 | Facebook: GWU | Facebook | N | 12,164 | 469,511 | 0.0063 | EH
45 | Facebook: Hamilton | Facebook | N | 2,312 | 96,393 | 0.0361 | EH
46 | Facebook: Harvard | Facebook | N | 15,086 | 824,595 | 0.0072 | EH
47 | Facebook: Haverford | Facebook | N | 1,446 | 59,589 | 0.0570 | EH
48 | Facebook: Howard | Facebook | N | 4,047 | 204,850 | 0.0250 | EH
49 | Facebook: Indiana | Facebook | N | 29,732 | 1,305,757 | 0.0030 | EH
50 | Facebook: JMU | Facebook | N | 14,070 | 485,564 | 0.0049 | EH
SM TABLE II. (Continued.)

ID | Name | Category | Weighted | N | L | f_e | References
51 | Facebook: Johns | Facebook | N | 5,157 | 186,572 | 0.0140 | EU
52 | Facebook: Lehigh | Facebook | N | 5,073 | 198,346 | 0.0154 | eh
53 | Facebook: Maine | Facebook | N | 9,065 | 243,245 | 0.0059 | EH
54 | Facebook: Maryland | Facebook | N | 20,829 | 744,832 | 0.0034 | ED
55 | Facebook: Mich | Facebook | N | 3,745 | 81,901 | 0.0117 | EH
56 | Facebook: Michigan | Facebook | N | 30,106 | 1,176,489 | 0.0026 | EH
57 | Facebook: Middlebury | Facebook | N | 3,069 | 124,607 | 0.0265 | EH
58 | Facebook: Mississippi | Facebook | N | 10,519 | 610,910 | 0.0110 | EH
59 | Facebook: MIT | Facebook | N | 6,402 | 251,230 | 0.0123 | EH
60 | Facebook: MSU | Facebook | N | 32,361 | 1,118,767 | 0.0021 | EH
61 | Facebook: MU | Facebook | N | 15,425 | 649,441 | 0.0055 | EH
62 | Facebook: Northeastern | Facebook | N | 13,868 | 381,919 | 0.0040 | EH
63 | Facebook: Northwestern | Facebook | N | 10,537 | 488,318 | 0.0088 | EH
64 | Facebook: Notre Dame | Facebook | N | 12,149 | 541,336 | 0.0073 | EH
65 | Facebook: NYU | Facebook | Y | 21,623 | 715,673 | 0.0031 | EH
66 | Facebook: Oberlin | Facebook | N | 2,920 | 89,912 | 0.0211 | EH
67 | Facebook: Oklahoma | Facebook | N | 17,420 | 892,524 | 0.0059 | EH
68 | Facebook: Penn | Facebook | N | 41,536 | 1,362,220 | 0.0016 | EH
69 | Facebook: Pepperdine | Facebook | N | 3,440 | 152,003 | 0.0257 | EH
70 | Facebook: Princeton | Facebook | N | 6,575 | 293,307 | 0.0136 | EH
71 | Facebook: Reed | Facebook | N | 962 | 18,812 | 0.0407 | EH
72 | Facebook: Rice | Facebook | N | 4,083 | 184,826 | 0.0222 | EH
73 | Facebook: Rochester | Facebook | N | 4,561 | 161,403 | 0.0155 | EH
74 | Facebook: Rutgers | Facebook | N | 24,568 | 784,596 | 0.0026 | EH
75 | Facebook: Santa | Facebook | N | 3,578 | 151,747 | 0.0237 | EH
76 | Facebook: Simmons | Facebook | N | 1,510 | 32,984 | 0.0290 | EH
77 | Facebook: Smith | Facebook | N | 2,970 | 97,133 | 0.0220 | EH
78 | Facebook: Stanford | Facebook | N | 11,586 | 568,309 | 0.0085 | EH
79 | Facebook: Swarthmore | Facebook | N | 1,657 | 61,049 | 0.0445 | EH
80 | Facebook: Syracuse | Facebook | N | 13,640 | 543,975 | 0.0058 | EH
81 | Facebook: Temple | Facebook | N | 13,653 | 360,774 | 0.0039 | EH
82 | Facebook: Tennessee | Facebook | N | 16,977 | 770,658 | 0.0053 | EH
83 | Facebook: Texas80 | Facebook | N | 31,538 | 1,219,639 | 0.0025 | EH
84 | Facebook: Texas84 | Facebook | N | 36,364 | 1,590,651 | 0.0024 | EH
85 | Facebook: Trinity | Facebook | N | 2,613 | 111,996 | 0.0328 | EH
86 | Facebook: Tufts | Facebook | N | 6,672 | 249,722 | 0.0112 | EH
87 | Facebook: Tulane | Facebook | N | 7,740 | 283,912 | 0.0095 | EH
88 | Facebook: U. Chicago | Facebook | N | 6,561 | 208,088 | 0.0097 | EH
89 | Facebook: U. Conn. | Facebook | N | 17,206 | 604,867 | 0.0041 | EH
90 | Facebook: U. Illinois | Facebook | N | 30,795 | 1,264,421 | 0.0027 | EH
91 | Facebook: U. Mass. | Facebook | N | 16,502 | 519,376 | 0.0038 | EH
92 | Facebook: U. Penn. | Facebook | N | 14,888 | 686,485 | 0.0062 | EH
93 | Facebook: UC33 | Facebook | N | 16,800 | 522,141 | 0.0037 | EH
94 | Facebook: UC61 | Facebook | N | 13,736 | 442,169 | 0.0047 | EH
95 | Facebook: UC64 | Facebook | N | 6,810 | 155,320 | 0.0067 | EH
96 | Facebook: UCF | Facebook | N | 14,936 | 428,987 | 0.0038 | EH
97 | Facebook: UCLA | Facebook | N | 20,453 | 747,604 | 0.0036 | EH
98 | Facebook: UCSB | Facebook | N | 14,917 | 482,215 | 0.0043 | EH
99 | Facebook: UCSC | Facebook | N | 8,979 | 224,578 | 0.0056 | EH
100 | Facebook: UCSD | Facebook | N | 14,936 | 443,215 | 0.0040 | EH
101 | Facebook: UF | Facebook | N | 35,111 | 1,465,654 | 0.0024 | EU
102 | Facebook: UGA | Facebook | N | 24,380 | 1,174,051 | 0.0040 | eh
103 | Facebook: UNC | Facebook | N | 18,158 | 766,796 | 0.0047 | EH
104 | Facebook: USC | Facebook | N | 17,440 | 801,851 | 0.0053 | ED
105 | Facebook: USF | Facebook | N | 13,367 | 321,209 | 0.0036 | EH
106 | Facebook: USFCA | Facebook | N | 2,672 | 65,244 | 0.0183 | EH
107 | Facebook: UVA | Facebook | N | 17,178 | 789,308 | 0.0054 | EH
108 | Facebook: Vanderbilt | Facebook | N | 8,063 | 427,829 | 0.0132 | EH
109 | Facebook: Vassar | Facebook | N | 3,068 | 119,161 | 0.0253 | EH
110 | Facebook: Vermont | Facebook | N | 7,322 | 191,220 | 0.0071 | EH
111 | Facebook: Villanova | Facebook | N | 7,755 | 314,980 | 0.0105 | EH
112 | Facebook: Virginia | Facebook | N | 21,319 | 698,175 | 0.0031 | EH
113 | Facebook: Wake | Facebook | N | 5,366 | 279,186 | 0.0194 | EH
114 | Facebook: Wash. U. | Facebook | N | 7,730 | 367,526 | 0.0123 | EH
115 | Facebook: Wellesley | Facebook | N | 2,970 | 94,899 | 0.0215 | EH
116 | Facebook: Wesleyan | Facebook | N | 3,591 | 138,034 | 0.0214 | EH
117 | Facebook: William | Facebook | N | 6,472 | 266,378 | 0.0127 | EH
118 | Facebook: Williams | Facebook | N | 2,788 | 112,985 | 0.0291 | EH
119 | Facebook: Wisconsin | Facebook | N | 23,831 | 835,946 | 0.0029 | EH
120 | Facebook: Yale | Facebook | N | 8,561 | 405,440 | 0.0111 | EH
121 | NYSE: 1980-1999 | Financial | Y | 477 | 113,526 | 1.0000 | EH
122 | NYSE: 1980-1983 | Financial | Y | 477 | 113,526 | 1.0000 | EH
123 | NYSE: 1984-1987 | Financial | Y | 477 | 113,526 | 1.0000 | E2
124 | NYSE: 1988-1991 | Financial | Y | 477 | 113,526 | 1.0000 | E2
125 | NYSE: 1992-1995 | Financial | Y | 477 | 113,526 | 1.0000 | EH
126 | NYSE: 1996-1999 | Financial | Y | 477 | 113,526 | 1.0000 | E2
127 | NYSE: H1 1985 | Financial | Y | 100 | 4,950 | 1.0000 | EH1
128 | NYSE: H2 1985 | Financial | Y | 100 | 4,950 | 1.0000 | EH
129 | NYSE: H1 1986 | Financial | Y | 100 | 4,950 | 1.0000 | EHI
130 | NYSE: H2 1986 | Financial | Y | 100 | 4,950 | 1.0000 | EH]
131 | NYSE: H1 1987 | Financial | Y | 100 | 4,950 | 1.0000 | EH]
132 | NYSE: H2 1987 | Financial | Y | 100 | 4,950 | 1.0000 | EH]
133 | NYSE: H1 1988 | Financial | Y | 100 | 4,950 | 1.0000 | EHI
134 | NYSE: H2 1988 | Financial | Y | 100 | 4,950 | 1.0000 | EH]
135 | NYSE: H1 1989 | Financial | Y | 100 | 4,950 | 1.0000 | EH]
136 | NYSE: H2 1989 | Financial | Y | 100 | 4,950 | 1.0000 | EHI
137 | NYSE: H1 1990 | Financial | Y | 100 | 4,950 | 1.0000 | EH]
138 | NYSE: H2 1990 | Financial | Y | 100 | 4,950 | 1.0000 | EHI
139 | NYSE: H1 1991 | Financial | Y | 100 | 4,950 | 1.0000 | EHI
140 | NYSE: H2 1991 | Financial | Y | 100 | 4,950 | 1.0000 | EHI
141 | NYSE: H1 1992 | Financial | Y | 100 | 4,950 | 1.0000 | EHI
142 | NYSE: H2 1992 | Financial | Y | 100 | 4,950 | 1.0000 | EHI
143 | NYSE: H1 1993 | Financial | Y | 100 | 4,950 | 1.0000 | EHI
144 | NYSE: H2 1993 | Financial | Y | 100 | 4,950 | 1.0000 | EH]
145 | NYSE: H1 1994 | Financial | Y | 100 | 4,950 | 1.0000 | EHI
146 | NYSE: H2 1994 | Financial | Y | 100 | 4,950 | 1.0000 | EHI
147 | NYSE: H1 1995 | Financial | Y | 100 | 4,950 | 1.0000 | EH]
148 | NYSE: H2 1995 | Financial | Y | 100 | 4,950 | 1.0000 | EHI
149 | NYSE: H1 1996 | Financial | Y | 100 | 4,950 | 1.0000 | EHI
150 | NYSE: H2 1996 | Financial | Y | 100 | 4,950 | 1.0000 | EH]
151 | NYSE: H1 1997 | Financial | Y | 100 | 4,950 | 1.0000 | [23]
152 | NYSE: H2 1997 | Financial | Y | 100 | 4,950 | 1.0000 | EH
153 | NYSE: H1 1998 | Financial | Y | 100 | 4,950 | 1.0000 | m
154 | NYSE: H2 1998 | Financial | Y | 100 | 4,950 | 1.0000 | m
155 | NYSE: H1 1999 | Financial | Y | 100 | 4,950 | 1.0000 | m
156 | NYSE: H2 1999 | Financial | Y | 100 | 4,950 | 1.0000 | m
157 | NYSE: H1 2000 | Financial | Y | 100 | 4,950 | 1.0000 | m
158 | NYSE: H2 2000 | Financial | Y | 100 | 4,950 | 1.0000 | m
159 | NYSE: H1 2001 | Financial | Y | 100 | 4,950 | 1.0000 | m
160 | NYSE: H2 2001 | Financial | Y | 100 | 4,950 | 1.0000 | m
161 | NYSE: H1 2002 | Financial | Y | 100 | 4,950 | 1.0000 | m
162 | NYSE: H2 2002 | Financial | Y | 100 | 4,950 | 1.0000 | m
163 | NYSE: H1 2003 | Financial | Y | 100 | 4,950 | 1.0000 | m
164 | NYSE: H2 2003 | Financial | Y | 100 | 4,950 | 1.0000 | m
165 | NYSE: H1 2004 | Financial | Y | 100 | 4,950 | 1.0000 | m
166 | NYSE: H2 2004 | Financial | Y | 100 | 4,950 | 1.0000 | m
167 | NYSE: H1 2005 | Financial | Y | 100 | 4,950 | 1.0000 | m\
168 | NYSE: H2 2005 | Financial | Y | 100 | 4,950 | 1.0000 | m
169 | NYSE: H1 2006 | Financial | Y | 100 | 4,950 | 1.0000 | m
170 | NYSE: H2 2006 | Financial | Y | 100 | 4,950 | 1.0000 | m
171 | NYSE: H1 2007 | Financial | Y | 100 | 4,950 | 1.0000 | m
172 | NYSE: H2 2007 | Financial | Y | 100 | 4,950 | 1.0000 | m
173 | NYSE: H1 2008 | Financial | Y | 100 | 4,950 | 1.0000 | m
174 | NYSE: H2 2008 | Financial | Y | 100 | 4,950 | 1.0000 | m
175 | Physarum polycephalum 0126-bm06-wt-k2-l | Fungal | Y | 411 | 645 | 0.0077 | m
176 | Physarum polycephalum 0149-bm05-wt-k2-l | Fungal | Y | 345 | 548 | 0.0092 | E2
177 | Physarum polycephalum 0157-bm03-wt-k2-l | Fungal | Y | 251 | 399 | 0.0127 | M
178 | Physarum polycephalum 0166-bm03-wt-k2-l | Fungal | Y | 492 | 778 | 0.0064 | EH
179 | Physarum polycephalum 0181-bm02-wt-k2-l | Fungal | Y | 192 | 307 | 0.0167 | EH
180 | Physarum polycephalum 0185-bm02-wt-k2-l | Fungal | Y | 199 | 311 | 0.0158 | EH
181 | Agrocybe gibberosa AG-1 | Fungal | Y | 2,366 | 3,665 | 0.0013 | EH
182 | Phallus impudicus PI113-1 | Fungal | Y | 543 | 725 | 0.0049 | EH
183 | Phallus impudicus PI120-1 | Fungal | Y | 483 | 559 | 0.0048 | EH
184 | Phallus impudicus PI37-1 | Fungal | Y | 644 | 826 | 0.0040 | EH
185 | Phallus impudicus PI40-1 | Fungal | Y | 550 | 748 | 0.0050 | [25]
186 | Phallus impudicus PI-1 | Fungal | Y | 1,357 | 1,858 | 0.0020 | EH
187 | Resinicium bicolor RB3ctl-3 | Fungal | Y | 202 | 233 | 0.0115 | EH
188 | Resinicium bicolor RB4ctl-3 | Fungal | Y | 426 | 545 | 0.0060 | EH
189 | Resinicium bicolor RB7ctl-3 | Fungal | Y | 380 | 458 | 0.0064 | EH
190 | Strophularia caerulea SC-1 | Fungal | Y | 536 | 689 | 0.0048 | [25]
191 | Phanerochaete velutina control11-1 | Fungal | Y | 65 | 71 | 0.0341 | [26]
192 | Phanerochaete velutina control11-2 | Fungal | Y | 117 | 136 | 0.0200 | m\
193 | Phanerochaete velutina control11-3 | Fungal | Y | 240 | 273 | 0.0095 | m
194 | Phanerochaete velutina control11-4 | Fungal | Y | 403 | 458 | 0.0057 | m
195 | Phanerochaete velutina control11-5 | Fungal | Y | 526 | 588 | 0.0043 | m
196 | Phanerochaete velutina control11-6 | Fungal | Y | 591 | 661 | 0.0038 | m
197 | Phanerochaete velutina control11-7 | Fungal | Y | 690 | 772 | 0.0032 | m
198 | Phanerochaete velutina control11-8 | Fungal | Y | 721 | 821 | 0.0032 | m
199 | Phanerochaete velutina control11-9 | Fungal | Y | 772 | 884 | 0.0030 | m
200 | Phanerochaete velutina control11-10 | Fungal | Y | 789 | 907 | 0.0029 | [26]
201 | Phanerochaete velutina control11-11* | Fungal | Y | 823 | 954 | 0.0028 | H2
202 | Phanerochaete velutina control17-1 | Fungal | Y | 16 | 15 | 0.1250 | m
203 | Phanerochaete velutina control17-2 | Fungal | Y | 232 | 240 | 0.0090 | m
204 | Phanerochaete velutina control17-3 | Fungal | Y | 502 | 539 | 0.0043 | m
205 | Phanerochaete velutina control17-4 | Fungal | Y | 703 | 754 | 0.0031 |
206 | Phanerochaete velutina control17-5 | Fungal | Y | 816 | 874 | 0.0026 | m
207 | Phanerochaete velutina control17-6 | Fungal | Y | 950 | 1,058 | 0.0023 | m
208 | Phanerochaete velutina control17-7 | Fungal | Y | 1,047 | 1,182 | 0.0022 | m
209 | Phanerochaete velutina control17-8 | Fungal | Y | 1,113 | 1,303 | 0.0021 | m
210 | Phanerochaete velutina control17-9 | Fungal | Y | 1,142 | 1,347 | 0.0021 | m
211 | Phanerochaete velutina control17-10 | Fungal | Y | 1,160 | 1,384 | 0.0021 | m
212 | Phanerochaete velutina control17-11 | Fungal | Y | 1,205 | 1,469 | 0.0020 | m\
213 | Phanerochaete velutina control4-1 | Fungal | Y | 200 | 213 | 0.0107 | m
214 | Phanerochaete velutina control4-2 | Fungal | Y | 461 | 490 | 0.0046 | m
215 | Phanerochaete velutina control4-3 | Fungal | Y | 826 | 862 | 0.0025 |
216 | Phanerochaete velutina control4-4 | Fungal | Y | 1,044 | 1,087 | 0.0020 | m
217 | Phanerochaete velutina control4-5 | Fungal | Y | 1,380 | 1,476 | 0.0016 | m
218 | Phanerochaete velutina control4-6 | Fungal | Y | 1,623 | 1,767 | 0.0013 | m
219 | Phanerochaete velutina control4-7 | Fungal | Y | 1,756 | 1,923 | 0.0012 | m
220 | Phanerochaete velutina control4-8 | Fungal | Y | 1,869 | 2,061 | 0.0012 | m
221 | Phanerochaete velutina control4-9 | Fungal | Y | 1,992 | 2,196 | 0.0011 | m
222 | Phanerochaete velutina control4-10 | Fungal | Y | 2,086 | 2,301 | 0.0011 | m\
223 | Phanerochaete velutina control4-11 | Fungal | Y | 2,190 | 2,431 | 0.0010 | m
224 | Phallus impudicus pil50ctl-l | Fungal | Y | 1,810 | 2,537 | 0.0015 | EH
225 | Phanerochaete velutina pv81-1 | Fungal | Y | 75 | 82 | 0.0295 | m
226 | Phanerochaete velutina pv81-2 | Fungal | Y | 653 | 897 | 0.0042 | G3
227 | Phanerochaete velutina pv81-3 | Fungal | Y | 911 | 1,255 | 0.0030 | [27]
228 | Phanerochaete velutina pv81-4 | Fungal | Y | 1,064 | 1,467 | 0.0026 | [27]
229 | Phanerochaete velutina pv81-5 | Fungal | Y | 986 | 1,351 | 0.0028 | [27]
230 | Phanerochaete velutina pv82-1 | Fungal | Y | 111 | 112 | 0.0183 | m
231 | Phanerochaete velutina pv82-2 | Fungal | Y | 467 | 523 | 0.0048 | m
232 | Phanerochaete velutina pv82-3 | Fungal | Y | 630 | 726 | 0.0037 | m
233 | Phanerochaete velutina pv82-4 | Fungal | Y | 644 | 749 | 0.0036 | G3
234 | Phanerochaete velutina pv82-5 | Fungal | Y | 551 | 627 | 0.0041 | EH
235 | Phanerochaete velutina pv83-1 | Fungal | Y | 129 | 142 | 0.0172 | m
236 | Phanerochaete velutina pv83-2 | Fungal | Y | 424 | 510 | 0.0057 | G3
237 | Phanerochaete velutina pv83-3 | Fungal | Y | 671 | 857 | 0.0038 | [27]
238 | Phanerochaete velutina pv83-4 | Fungal | Y | 708 | 905 | 0.0036 | mi
239 | Phanerochaete velutina pv83-5 | Fungal | Y | 551 | 668 | 0.0044 | EH
240 | Online Dictionary of Computing | Language | Y | 13,356 | 91,471 | 0.0010 | M
241 | Online Dictionary Of Information Science | Language | Y | 2,898 | 16,376 | 0.0039 | [El [29]
242 | Reuters 9/11 news | Language | Y | 13,308 | 148,035 | 0.0017 | [30]
243 | Roget's thesaurus | Language | N | 994 | 3,640 | 0.0074 | [HE]
244 | Word adjacency: English | Language | N | 7,377 | 44,205 | 0.0016 | [32]
245 | Word adjacency: French | Language | N | 8,308 | 23,832 | 0.0007 | [32]
246 | Word adjacency: Japanese | Language | N | 2,698 | 7,995 | 0.0022 | [32]
247 | Word adjacency: Spanish | Language | N | 11,558 | 43,050 | 0.0006 | [32]
248 | Metabolic: AA | Metabolic | N | 411 | 1,818 | 0.0216 | [33]
249 | Metabolic: AB | Metabolic | N | 386 | 1,691 | 0.0228 | [33]
250 | Metabolic: AG | Metabolic | N | 494 | 2,173 | 0.0178 | [33]
251 | Metabolic: AP | Metabolic | N | 201 | 857 | 0.0426 | [33]
252 | Metabolic: AT | Metabolic | N | 296 | 1,231 | 0.0282 | [33]
253 | Metabolic: BB | Metabolic | N | 175 | 628 | 0.0412 | [33]
254 | Metabolic: BS | Metabolic | N | 772 | 3,611 | 0.0121 | [33]
255 | Metabolic: CA | Metabolic | N | 483 | 2,274 | 0.0195 | [33]
256 | Metabolic: CE | Metabolic | N | 453 | 2,025 | 0.0198 | [33]
257 | Metabolic: CJ | Metabolic | N | 370 | 1,631 | 0.0239 | [33]
258 | Metabolic: CL | Metabolic | N | 382 | 1,646 | 0.0226 | [33]
259 | Metabolic: CQ | Metabolic | N | 187 | 663 | 0.0381 | [33]
260 | Metabolic: CT | Metabolic | N | 211 | 772 | 0.0348 | [33]
261 | Metabolic: CY | Metabolic | N | 537 | 2,503 | 0.0174 | [33]
262 | Metabolic: DR | Metabolic | N | 800 | 3,789 | 0.0119 | [33]
263 | Metabolic: EC | Metabolic | N | 762 | 3,683 | 0.0127 | [33]
264 | Metabolic: EF | Metabolic | N | 375 | 1,721 | 0.0245 | [33]
265 | Metabolic: EN | Metabolic | N | 374 | 1,617 | 0.0232 | [33]
266 | Metabolic: HI | Metabolic | N | 505 | 2,325 | 0.0183 | [33]
267 | Metabolic: HP | Metabolic | N | 365 | 1,703 | 0.0256 | [33]
268 | Metabolic: MB | Metabolic | N | 418 | 1,850 | 0.0212 | [33]
269 | Metabolic: MG | Metabolic | N | 199 | 783 | 0.0397 | [33]
270 | Metabolic: MJ | Metabolic | N | 422 | 1,874 | 0.0211 | [33]
271 | Metabolic: ML | Metabolic | N | 414 | 1,862 | 0.0218 | [33]
272 | Metabolic: MP | Metabolic | N | 171 | 685 | 0.0471 | [33]
273 | Metabolic: MT | Metabolic | N | 577 | 2,653 | 0.0160 | [33]
274 | Metabolic: NG | Metabolic | N | 394 | 1,824 | 0.0236 | [33]
275 | Metabolic: NM | Metabolic | N | 369 | 1,708 | 0.0252 | [33]
276 | Metabolic: OS | Metabolic | N | 285 | 1,168 | 0.0289 | [33]
277 | Metabolic: PA | Metabolic | N | 720 | 3,429 | 0.0132 | [33]
278 | Metabolic: PF | Metabolic | N | 310 | 1,379 | 0.0288 | [33]
279 | Metabolic: PG | Metabolic | N | 412 | 1,772 | 0.0209 | [33]
280 | Metabolic: PH | Metabolic | N | 318 | 1,394 | 0.0277 | [33]
281 | Metabolic: PN | Metabolic | N | 405 | 1,829 | 0.0224 | [33]
282 | Metabolic: RC | Metabolic | N | 663 | 3,111 | 0.0142 | [33]
283 | Metabolic: RP | Metabolic | N | 203 | 775 | 0.0378 | [33]
284 | Metabolic: SC | Metabolic | N | 552 | 2,595 | 0.0171 | [33]
285 | Metabolic: ST | Metabolic | N | 391 | 1,756 | 0.0230 | [33]
286 | Metabolic: TH | Metabolic | N | 427 | 1,955 | 0.0215 | [33]
287 | Metabolic: TM | Metabolic | N | 328 | 1,452 | 0.0271 | [33]
288 | Metabolic: TP | Metabolic | N | 194 | 788 | 0.0421 | [33]
289 | Metabolic: TY | Metabolic | N | 803 | 3,863 | 0.0120 | [33]
290 | Metabolic: YP | Metabolic | N | 552 | 2,471 | 0.0162 | [33]
291 | US political books co-purchase* | Other | N | 105 | 441 | 0.0808 | M
292 | Power grid | Other | N | 4,941 | 6,594 | 0.0005 | m
293 | Slovenian magazine co-purchase | Other | Y | 124 | 5,972 | 0.7831 | m
294 | Transcription: E. coli | Other | N | 328 | 456 | 0.0085 | m
295 | Transcription: Yeast | Other | N | 662 | 1,062 | 0.0049 | m
296 | US airlines | Other | Y | 324 | 2,081 | 0.0398 | SUE]
297 | 2008 NCAA football schedule* | Other | Y | 121 | 764 | 0.1052 | [38]
298 | Internet: autonomous systems | Other | N | 22,963 | 48,436 | 0.0002 | [39]
299 | Garfield: scientometrics citations | Other | Y | 2,678 | 10,368 | 0.0029 | go]
300 | Garfield: Small and Griffith citations | Other | Y | 1,024 | 4,916 | 0.0094 | M
301 | Garfield: small-world citations | Other | N | 233 | 994 | 0.0368 | na
302 | Electronic circuit (s208)* | Other | N | 122 | 189 | 0.0256 | [32]
303 | Electronic circuit (s420) | Other | N | 252 | 399 | 0.0126 | [32]
304 | Electronic circuit (s838) | Other | N | 512 | 819 | 0.0063 | [32]
305 | Protein: serine protease inhibitor (1EAW)* | Other | N | 53 | 123 | 0.0893 | [32]
306 | Protein: immunoglobulin (1A4J)* | Other | N | 95 | 213 | 0.0477 | [32]
307 | Protein: oxidoreductase (1AOR)* | Other | N | 97 | 212 | 0.0455 | [32]
308 | AIDS blogs* | Other | N | 146 | 180 | 0.0170 | SB
309 | Political blogs | Other | Y | 1,222 | 16,714 | 0.0224 | [42]
310 | WWW (Stanford) | Other | N | 8,929 | 26,320 | 0.0007 | m
311 | Trade product proximity | Other | Y | 775 | 283,094 | 0.9439 | [44]
312 | World trade in metal (1994): Net | Other | Y | 80 | 875 | 0.2769 | nans]
313 | World trade in metal (1994): Total | Other | Y | 80 | 875 | 0.2769 | sass]
314 | Bill cosponsorship: US House 96 | Political: cosponsorship | Y | 438 | 95,529 | 0.9982 | [46] [47]
315 | Bill cosponsorship: US House 97 | Political: cosponsorship | Y | 435 | 94,374 | 0.9998 | [46] [47]
316 | Bill cosponsorship: US House 98 | Political: cosponsorship | Y | 437 | 95,256 | 0.9999 | [46] [47]
317 | Bill cosponsorship: US House 99 | Political: cosponsorship | Y | 437 | 94,999 | 0.9972 | [46] [47]
318 | Bill cosponsorship: US House 100 | Political: cosponsorship | Y | 439 | 96,125 | 0.9998 | [46] [47]
319 | Bill cosponsorship: US House 101 | Political: cosponsorship | Y | 437 | 95,263 | 1.0000 | [46] [47]
320 | Bill cosponsorship: US House 102 | Political: cosponsorship | Y | 437 | 95,051 | 0.9977 | [46] [47]
321 | Bill cosponsorship: US House 103 | Political: cosponsorship | Y | 437 | 95,028 | 0.9975 | [46] [47]
322 | Bill cosponsorship: US House 104 | Political: cosponsorship | Y | 439 | 95,925 | 0.9978 | [46] [47]
323 | Bill cosponsorship: US House 105 | Political: cosponsorship | Y | 442 | 97,373 | 0.9991 | [46] [47]
324 | Bill cosponsorship: US House 106 | Political: cosponsorship | Y | 436 | 94,820 | 0.9999 | [46] [47]
325 | Bill cosponsorship: US House 107 | Political: cosponsorship | Y | 442 | 97,233 | 0.9977 | [46] [47]
326 | Bill cosponsorship: US House 108 | Political: cosponsorship | Y | 439 | 96,104 | 0.9996 | [46] [47]
327 | Bill cosponsorship: US Senate 96 | Political: cosponsorship | Y | 101 | 5,050 | 1.0000 | [46] [47]
328 | Bill cosponsorship: US Senate 97 | Political: cosponsorship | Y | 101 | 5,050 | 1.0000 | [46] [47]
329 | Bill cosponsorship: US Senate 98 | Political: cosponsorship | Y | 101 | 5,050 | 1.0000 | [46] [47]
330 | Bill cosponsorship: US Senate 99 | Political: cosponsorship | Y | 101 | 5,049 | 0.9998 | [46] [47]
331 | Bill cosponsorship: US Senate 100 | Political: cosponsorship | Y | 101 | 5,050 | 1.0000 | [46] [47]
332 | Bill cosponsorship: US Senate 101 | Political: cosponsorship | Y | 100 | 4,950 | 1.0000 | [46] [47]
333 | Bill cosponsorship: US Senate 102 | Political: cosponsorship | Y | 102 | 5,142 | 0.9983 | [46] [47]
334 | Bill cosponsorship: US Senate 103 | Political: cosponsorship | Y | 101 | 5,050 | 1.0000 | [46] [47]
335 | Bill cosponsorship: US Senate 104 | Political: cosponsorship | Y | 102 | 5,151 | 1.0000 | [46] [47]
336 | Bill cosponsorship: US Senate 105 | Political: cosponsorship | Y | 100 | 4,950 | 1.0000 | [46] [47]
337 | Bill cosponsorship: US Senate 106 | Political: cosponsorship | Y | 102 | 5,151 | 1.0000 | [46] [47]
338 | Bill cosponsorship: US Senate 107 | Political: cosponsorship | Y | 101 | 5,049 | 0.9998 | [46] [47]
339 | Bill cosponsorship: US Senate 108 | Political: cosponsorship | Y | 100 | 4,950 | 1.0000 | [46] [47]
340 | Committees: US House 101, comms. | Political: committee | N | 159 | 3,610 | 0.2874 | SUES
341 | Committees: US House 102, comms. | Political: committee | N | 163 | 4,093 | 0.3100 | SUSS]
342 | Committees: US House 103, comms. | Political: committee | N | 141 | 2,983 | 0.3022 | SiSS]
343 | Committees: US House 104, comms. | Political: committee | N | 106 | 1,839 | 0.3305 | Si S3
344 | Committees: US House 105, comms. | Political: committee | N | 108 | 1,997 | 0.3456 | SUBS]
345 | Committees: US House 106, comms. | Political: committee | N | 107 | 2,031 | 0.3581 | Si S3
346 | Committees: US House 107, comms. | Political: committee | N | 113 | 2,429 | 0.3838 | SiSS]
347 | Committees: US House 108, comms. | Political: committee | N | 118 | 2,905 | 0.4208 | HUBS]
348 | Committees: US House 101, Reps. | Political: committee | N | 434 | 18,714 | 0.1992 | Basis
349 | Committees: US House 102, Reps. | Political: committee | N | 436 | 20,134 | 0.2123 | Bass]
350 | Committees: US House 103, Reps. | Political: committee | N | 437 | 18,212 | 0.1912 | SiSS]
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ID 
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N 


L 


fe 
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351 


Committees: US House 104, Reps. 


Political 


committee 


N 


432 


17,130 


0.1840 


BH1H9] 


352 


Committees: US House 105, Reps. 


Political 


committee 


N 


435 


18,297 


0.1938 


noma 


353 


Committees: US House 106, Reps. 


Political 


committee 


N 


435 


18,832 


0.1995 


BUSS] 


354 


Committees: US House 107, Reps. 


Political 


committee 


N 


434 


19,824 


0.2110 


BH1H9] 


355 


Committees: US House 108, Reps. 


Political 


committee 


N 


437 


21,214 


0.2227 


sasg 


356 


Roll call 


US House 1 


Political 


voting 


Y 


66 


2,122 


0.9893 


[Soma 


357 


Roll call 


US House 2 


Political 


voting 


Y 


71 


2,428 


0.9771 


[50H52] 


358 


Roll call 


US House 3 


Political 


voting 


Y 


108 


5,669 


0.9811 


[50H52] 


359 


Roll call 


US House 4 


Political 


voting 


Y 


114 


6,342 


0.9846 


[Soma 


360 


Roll call 


US House 5 


Political 


voting 


Y 


117 


6,600 


0.9726 


[50H52] 


361 


Roll call 


US House 6 


Political 


voting 


Y 


113 


6,222 


0.9832 


[50H52] 


362 | Roll call: US House 7 | Political: voting | Y | 110 | 5,921 | 0.9877 | [50–52]
363 | Roll call: US House 8 | Political: voting | Y | 149 | 10,888 | 0.9875 | [50–52]
364 | Roll call: US House 9 | Political: voting | Y | 147 | 10,582 | 0.9861 | [50–52]
365 | Roll call: US House 10 | Political: voting | Y | 149 | 10,857 | 0.9847 | [50–52]
366 | Roll call: US House 11 | Political: voting | Y | 153 | 11,482 | 0.9874 | [50–52]
367 | Roll call: US House 12 | Political: voting | Y | 146 | 10,535 | 0.9953 | [50–52]
368 | Roll call: US House 13 | Political: voting | Y | 195 | 18,723 | 0.9898 | [50–52]
369 | Roll call: US House 14 | Political: voting | Y | 195 | 18,540 | 0.9802 | [50–52]
370 | Roll call: US House 15 | Political: voting | Y | 195 | 18,666 | 0.9868 | [50–52]
371 | Roll call: US House 16 | Political: voting | Y | 197 | 19,118 | 0.9903 | [50–52]
372 | Roll call: US House 17 | Political: voting | Y | 199 | 19,429 | 0.9862 | [50–52]
373 | Roll call: US House 18 | Political: voting | Y | 221 | 23,812 | 0.9795 | [50–52]
374 | Roll call: US House 19 | Political: voting | Y | 220 | 23,993 | 0.9960 | [50–52]
375 | Roll call: US House 20 | Political: voting | Y | 219 | 23,666 | 0.9914 | [50–52]
376 | Roll call: US House 21 | Political: voting | Y | 220 | 23,985 | 0.9956 | [50–52]
377 | Roll call: US House 22 | Political: voting | Y | 217 | 23,404 | 0.9986 | [50–52]
378 | Roll call: US House 23 | Political: voting | Y | 257 | 32,502 | 0.9880 | [50–52]
379 | Roll call: US House 24 | Political: voting | Y | 255 | 32,062 | 0.9900 | [50–52]
380 | Roll call: US House 25 | Political: voting | Y | 256 | 32,366 | 0.9916 | [50–52]
381 | Roll call: US House 26 | Political: voting | Y | 255 | 32,067 | 0.9902 | [50–52]
382 | Roll call: US House 27 | Political: voting | Y | 257 | 32,743 | 0.9953 | [50–52]
383 | Roll call: US House 28 | Political: voting | Y | 234 | 26,788 | 0.9826 | [50–52]
384 | Roll call: US House 29 | Political: voting | Y | 236 | 27,562 | 0.9939 | [50–52]
385 | Roll call: US House 30 | Political: voting | Y | 236 | 27,669 | 0.9978 | [50–52]
386 | Roll call: US House 31 | Political: voting | Y | 241 | 28,804 | 0.9960 | [50–52]
387 | Roll call: US House 32 | Political: voting | Y | 239 | 28,318 | 0.9957 | [50–52]
388 | Roll call: US House 33 | Political: voting | Y | 240 | 28,570 | 0.9962 | [50–52]
389 | Roll call: US House 34 | Political: voting | Y | 236 | 27,545 | 0.9933 | [50–52]
390 | Roll call: US House 35 | Political: voting | Y | 245 | 29,630 | 0.9913 | [50–52]
391 | Roll call: US House 36 | Political: voting | Y | 243 | 29,312 | 0.9969 | [50–52]
392 | Roll call: US House 37 | Political: voting | Y | 197 | 18,735 | 0.9704 | [50–52]
393 | Roll call: US House 38 | Political: voting | Y | 187 | 17,326 | 0.9963 | [50–52]
394 | Roll call: US House 39 | Political: voting | Y | 199 | 19,593 | 0.9945 | [50–52]
395 | Roll call: US House 40 | Political: voting | Y | 233 | 26,605 | 0.9843 | [50–52]
396 | Roll call: US House 41 | Political: voting | Y | 256 | 32,109 | 0.9837 | [50–52]
397 | Roll call: US House 42 | Political: voting | Y | 253 | 31,626 | 0.9921 | [50–52]
398 | Roll call: US House 43 | Political: voting | Y | 302 | 45,151 | 0.9934 | [50–52]
399 | Roll call: US House 44 | Political: voting | Y | 308 | 46,723 | 0.9883 | [50–52]
400 | Roll call: US House 45 | Political: voting | Y | 302 | 45,315 | 0.9970 | [50–52]
401 | Roll call: US House 46 | Political: voting | Y | 301 | 44,987 | 0.9964 | [50–52]
402 | Roll call: US House 47 | Political: voting | Y | 306 | 46,214 | 0.9903 | [50–52]
403 | Roll call: US House 48 | Political: voting | Y | 338 | 56,484 | 0.9918 | [50–52]
404 | Roll call: US House 49 | Political: voting | Y | 330 | 54,160 | 0.9977 | [50–52]
405 | Roll call: US House 50 | Political: voting | Y | 326 | 52,907 | 0.9987 | [50–52]
406 | Roll call: US House 51 | Political: voting | Y | 347 | 59,303 | 0.9879 | [50–52]
407 | Roll call: US House 52 | Political: voting | Y | 340 | 57,285 | 0.9940 | [50–52]
408 | Roll call: US House 53 | Political: voting | Y | 376 | 69,943 | 0.9921 | [50–52]
409 | Roll call: US House 54 | Political: voting | Y | 368 | 67,085 | 0.9934 | [50–52]
410 | Roll call: US House 55 | Political: voting | Y | 371 | 68,270 | 0.9947 | [50–52]
411 | Roll call: US House 56 | Political: voting | Y | 369 | 67,059 | 0.9877 | [50–52]
412 | Roll call: US House 57 | Political: voting | Y | 371 | 67,383 | 0.9818 | [50–52]
413 | Roll call: US House 58 | Political: voting | Y | 397 | 75,891 | 0.9655 | [50–52]
414 | Roll call: US House 59 | Political: voting | Y | 397 | 76,299 | 0.9707 | [50–52]
415 | Roll call: US House 60 | Political: voting | Y | 398 | 77,921 | 0.9863 | [50–52]
416 | Roll call: US House 61 | Political: voting | Y | 402 | 80,174 | 0.9947 | [50–52]
417 | Roll call: US House 62 | Political: voting | Y | 408 | 82,442 | 0.9929 | [50–52]
418 | Roll call: US House 63 | Political: voting | Y | 452 | 101,498 | 0.9958 | [50–52]
419 | Roll call: US House 64 | Political: voting | Y | 441 | 96,780 | 0.9975 | [50–52]
420 | Roll call: US House 65 | Political: voting | Y | 454 | 102,108 | 0.9930 | [50–52]
421 | Roll call: US House 66 | Political: voting | Y | 453 | 101,199 | 0.9885 | [50–52]
422 | Roll call: US House 67 | Political: voting | Y | 452 | 101,482 | 0.9956 | [50–52]
423 | Roll call: US House 68 | Political: voting | Y | 442 | 96,885 | 0.9941 | [50–52]
424 | Roll call: US House 69 | Political: voting | Y | 437 | 95,226 | 0.9996 | [50–52]
425 | Roll call: US House 70 | Political: voting | Y | 443 | 97,497 | 0.9959 | [50–52]
426 | Roll call: US House 71 | Political: voting | Y | 455 | 102,502 | 0.9924 | [50–52]
427 | Roll call: US House 72 | Political: voting | Y | 447 | 99,028 | 0.9934 | [50–52]
428 | Roll call: US House 73 | Political: voting | Y | 445 | 98,647 | 0.9986 | [50–52]
429 | Roll call: US House 74 | Political: voting | Y | 440 | 96,170 | 0.9958 | [50–52]
430 | Roll call: US House 75 | Political: voting | Y | 445 | 98,474 | 0.9968 | [50–52]
431 | Roll call: US House 76 | Political: voting | Y | 456 | 102,495 | 0.9880 | [50–52]
432 | Roll call: US House 77 | Political: voting | Y | 450 | 99,956 | 0.9894 | [50–52]
433 | Roll call: US House 78 | Political: voting | Y | 450 | 100,513 | 0.9949 | [50–52]
434 | Roll call: US House 79 | Political: voting | Y | 448 | 99,246 | 0.9912 | [50–52]
435 | Roll call: US House 80 | Political: voting | Y | 448 | 99,902 | 0.9977 | [50–52]
436 | Roll call: US House 81 | Political: voting | Y | 444 | 98,054 | 0.9970 | [50–52]
437 | Roll call: US House 82 | Political: voting | Y | 447 | 99,281 | 0.9960 | [50–52]
438 | Roll call: US House 83 | Political: voting | Y | 440 | 96,506 | 0.9992 | [50–52]
439 | Roll call: US House 84 | Political: voting | Y | 437 | 95,253 | 0.9999 | [50–52]
440 | Roll call: US House 85 | Political: voting | Y | 444 | 97,955 | 0.9960 | [50–52]
441 | Roll call: US House 86 | Political: voting | Y | 443 | 97,377 | 0.9946 | [50–52]
442 | Roll call: US House 87 | Political: voting | Y | 449 | 99,774 | 0.9920 | [50–52]
443 | Roll call: US House 88 | Political: voting | Y | 443 | 97,842 | 0.9994 | [50–52]
444 | Roll call: US House 89 | Political: voting | Y | 442 | 97,139 | 0.9967 | [50–52]
445 | Roll call: US House 90 | Political: voting | Y | 437 | 95,251 | 0.9998 | [50–52]
446 | Roll call: US House 91 | Political: voting | Y | 448 | 99,815 | 0.9969 | [50–52]
447 | Roll call: US House 92 | Political: voting | Y | 443 | 97,579 | 0.9967 | [50–52]
448 | Roll call: US House 93 | Political: voting | Y | 443 | 97,848 | 0.9994 | [50–52]
449 | Roll call: US House 94 | Political: voting | Y | 441 | 96,837 | 0.9981 | [50–52]
450 | Roll call: US House 95 | Political: voting | Y | 441 | 96,493 | 0.9946 | [50–52]




SM TABLE II. (Continued.) 



ID | Name | Category | Weighted | N | L | f_e | References
451 | Roll call: US House 96 | Political: voting | Y | 440 | 96,379 | 0.9979 | [50–52]
452 | Roll call: US House 97 | Political: voting | Y | 442 | 96,761 | 0.9928 | [50–52]
453 | Roll call: US House 98 | Political: voting | Y | 439 | 95,922 | 0.9977 | [50–52]
454 | Roll call: US House 99 | Political: voting | Y | 439 | 95,875 | 0.9972 | [50–52]
455 | Roll call: US House 100 | Political: voting | Y | 440 | 96,544 | 0.9996 | [50–52]
456 | Roll call: US House 101 | Political: voting | Y | 440 | 96,505 | 0.9992 | [50–52]
457 | Roll call: US House 102 | Political: voting | Y | 441 | 96,811 | 0.9978 | [50–52]
458 | Roll call: US House 103 | Political: voting | Y | 441 | 96,348 | 0.9931 | [50–52]
459 | Roll call: US House 104 | Political: voting | Y | 445 | 98,720 | 0.9993 | [50–52]
460 | Roll call: US House 105 | Political: voting | Y | 443 | 97,841 | 0.9994 | [50–52]
461 | Roll call: US House 106 | Political: voting | Y | 440 | 96,557 | 0.9998 | [50–52]
462 | Roll call: US House 107 | Political: voting | Y | 443 | 97,816 | 0.9991 | [50–52]
463 | Roll call: US House 108 | Political: voting | Y | 440 | 96,561 | 0.9998 | [50–52]
464 | Roll call: US House 109 | Political: voting | Y | 440 | 96,549 | 0.9997 | [50–52]
465 | Roll call: US House 110 | Political: voting | Y | 448 | 99,603 | 0.9948 | [50–52]
466 | Roll call: US Senate 1 | Political: voting | Y | 29 | 393 | 0.9680 | [50–52]
467 | Roll call: US Senate 2 | Political: voting | Y | 31 | 449 | 0.9656 | [50–52]
468 | Roll call: US Senate 3 | Political: voting | Y | 32 | 472 | 0.9516 | [50–52]
469 | Roll call: US Senate 4 | Political: voting | Y | 43 | 760 | 0.8416 | [50–52]
470 | Roll call: US Senate 5 | Political: voting | Y | 44 | 808 | 0.8541 | [50–52]
471 | Roll call: US Senate 6 | Political: voting | Y | 37 | 644 | 0.9670 | [50–52]
472 | Roll call: US Senate 7 | Political: voting | Y | 35 | 537 | 0.9025 | [50–52]
473 | Roll call: US Senate 8 | Political: voting | Y | 44 | 864 | 0.9133 | [50–52]
474 | Roll call: US Senate 9 | Political: voting | Y | 37 | 645 | 0.9685 | [50–52]
475 | Roll call: US Senate 10 | Political: voting | Y | 37 | 660 | 0.9910 | [50–52]
476 | Roll call: US Senate 11 | Political: voting | Y | 44 | 855 | 0.9038 | [50–52]
477 | Roll call: US Senate 12 | Political: voting | Y | 37 | 663 | 0.9955 | [50–52]
478 | Roll call: US Senate 13 | Political: voting | Y | 46 | 947 | 0.9150 | [50–52]
479 | Roll call: US Senate 14 | Political: voting | Y | 44 | 898 | 0.9493 | [50–52]
480 | Roll call: US Senate 15 | Political: voting | Y | 46 | 977 | 0.9440 | [50–52]
481 | Roll call: US Senate 16 | Political: voting | Y | 51 | 1,249 | 0.9796 | [50–52]
482 | Roll call: US Senate 17 | Political: voting | Y | 52 | 1,294 | 0.9759 | [50–52]
483 | Roll call: US Senate 18 | Political: voting | Y | 52 | 1,304 | 0.9834 | [50–52]
484 | Roll call: US Senate 19 | Political: voting | Y | 59 | 1,589 | 0.9287 | [50–52]
485 | Roll call: US Senate 20 | Political: voting | Y | 53 | 1,343 | 0.9746 | [50–52]
486 | Roll call: US Senate 21 | Political: voting | Y | 54 | 1,339 | 0.9357 | [50–52]
487 | Roll call: US Senate 22 | Political: voting | Y | 53 | 1,348 | 0.9782 | [50–52]
488 | Roll call: US Senate 23 | Political: voting | Y | 54 | 1,378 | 0.9630 | [50–52]
489 | Roll call: US Senate 24 | Political: voting | Y | 61 | 1,732 | 0.9464 | [50–52]
490 | Roll call: US Senate 25 | Political: voting | Y | 58 | 1,627 | 0.9843 | [50–52]
491 | Roll call: US Senate 26 | Political: voting | Y | 60 | 1,689 | 0.9542 | [50–52]
492 | Roll call: US Senate 27 | Political: voting | Y | 59 | 1,662 | 0.9714 | [50–52]
493 | Roll call: US Senate 28 | Political: voting | Y | 57 | 1,575 | 0.9868 | [50–52]
494 | Roll call: US Senate 29 | Political: voting | Y | 63 | 1,895 | 0.9703 | [50–52]
495 | Roll call: US Senate 30 | Political: voting | Y | 72 | 2,320 | 0.9077 | [50–52]
496 | Roll call: US Senate 31 | Political: voting | Y | 70 | 2,341 | 0.9694 | [50–52]
497 | Roll call: US Senate 32 | Political: voting | Y | 73 | 2,511 | 0.9555 | [50–52]
498 | Roll call: US Senate 33 | Political: voting | Y | 70 | 2,308 | 0.9557 | [50–52]
499 | Roll call: US Senate 34 | Political: voting | Y | 64 | 2,002 | 0.9931 | [50–52]
500 | Roll call: US Senate 35 | Political: voting | Y | 73 | 2,542 | 0.9673 | [50–52]
501 | Roll call: US Senate 36 | Political: voting | Y | 70 | 2,370 | 0.9814 | [50–52]
502 | Roll call: US Senate 37 | Political: voting | Y | 70 | 2,051 | 0.8493 | [50–52]
503 | Roll call: US Senate 38 | Political: voting | Y | 54 | 1,402 | 0.9797 | [50–52]
504 | Roll call: US Senate 39 | Political: voting | Y | 59 | 1,610 | 0.9410 | [50–52]
505 | Roll call: US Senate 40 | Political: voting | Y | 69 | 2,274 | 0.9693 | [50–52]
506 | Roll call: US Senate 41 | Political: voting | Y | 80 | 3,084 | 0.9759 | [50–52]
507 | Roll call: US Senate 42 | Political: voting | Y | 75 | 2,773 | 0.9993 | [50–52]
508 | Roll call: US Senate 43 | Political: voting | Y | 79 | 3,041 | 0.9870 | [50–52]
509 | Roll call: US Senate 44 | Political: voting | Y | 82 | 3,261 | 0.9819 | [50–52]
510 | Roll call: US Senate 45 | Political: voting | Y | 82 | 3,265 | 0.9831 | [50–52]
511 | Roll call: US Senate 46 | Political: voting | Y | 81 | 3,219 | 0.9935 | [50–52]
512 | Roll call: US Senate 47 | Political: voting | Y | 83 | 3,362 | 0.9880 | [50–52]
513 | Roll call: US Senate 48 | Political: voting | Y | 78 | 2,998 | 0.9983 | [50–52]
514 | Roll call: US Senate 49 | Political: voting | Y | 81 | 3,210 | 0.9907 | [50–52]
515 | Roll call: US Senate 50 | Political: voting | Y | 76 | 2,850 | 1.0000 | [50–52]
516 | Roll call: US Senate 51 | Political: voting | Y | 91 | 3,998 | 0.9763 | [50–52]
517 | Roll call: US Senate 52 | Political: voting | Y | 93 | 4,249 | 0.9932 | [50–52]
518 | Roll call: US Senate 53 | Political: voting | Y | 95 | 4,413 | 0.9884 | [50–52]
519 | Roll call: US Senate 54 | Political: voting | Y | 90 | 4,000 | 0.9988 | [50–52]
520 | Roll call: US Senate 55 | Political: voting | Y | 96 | 4,445 | 0.9748 | [50–52]
521 | Roll call: US Senate 56 | Political: voting | Y | 93 | 4,201 | 0.9820 | [50–52]
522 | Roll call: US Senate 57 | Political: voting | Y | 90 | 3,939 | 0.9835 | [50–52]
523 | Roll call: US Senate 58 | Political: voting | Y | 93 | 4,174 | 0.9757 | [50–52]
524 | Roll call: US Senate 59 | Political: voting | Y | 93 | 4,251 | 0.9937 | [50–52]
525 | Roll call: US Senate 60 | Political: voting | Y | 95 | 4,382 | 0.9814 | [50–52]
526 | Roll call: US Senate 61 | Political: voting | Y | 102 | 5,033 | 0.9771 | [50–52]
527 | Roll call: US Senate 62 | Political: voting | Y | 109 | 5,719 | 0.9716 | [50–52]
528 | Roll call: US Senate 63 | Political: voting | Y | 101 | 5,029 | 0.9958 | [50–52]
529 | Roll call: US Senate 64 | Political: voting | Y | 100 | 4,931 | 0.9962 | [50–52]
530 | Roll call: US Senate 65 | Political: voting | Y | 111 | 5,899 | 0.9663 | [50–52]
531 | Roll call: US Senate 66 | Political: voting | Y | 101 | 5,005 | 0.9911 | [50–52]
532 | Roll call: US Senate 67 | Political: voting | Y | 105 | 5,413 | 0.9914 | [50–52]
533 | Roll call: US Senate 68 | Political: voting | Y | 102 | 5,081 | 0.9864 | [50–52]
534 | Roll call: US Senate 69 | Political: voting | Y | 105 | 5,353 | 0.9804 | [50–52]
535 | Roll call: US Senate 70 | Political: voting | Y | 102 | 5,082 | 0.9866 | [50–52]
536 | Roll call: US Senate 71 | Political: voting | Y | 109 | 5,779 | 0.9818 | [50–52]
537 | Roll call: US Senate 72 | Political: voting | Y | 103 | 5,220 | 0.9937 | [50–52]
538 | Roll call: US Senate 73 | Political: voting | Y | 100 | 4,879 | 0.9857 | [50–52]
539 | Roll call: US Senate 74 | Political: voting | Y | 100 | 4,933 | 0.9966 | [50–52]
540 | Roll call: US Senate 75 | Political: voting | Y | 102 | 5,126 | 0.9951 | [50–52]
541 | Roll call: US Senate 76 | Political: voting | Y | 104 | 5,106 | 0.9533 | [50–52]
542 | Roll call: US Senate 77 | Political: voting | Y | 108 | 5,575 | 0.9649 | [50–52]
543 | Roll call: US Senate 78 | Political: voting | Y | 104 | 5,304 | 0.9903 | [50–52]
544 | Roll call: US Senate 79 | Political: voting | Y | 107 | 5,466 | 0.9639 | [50–52]
545 | Roll call: US Senate 80 | Political: voting | Y | 97 | 4,655 | 0.9998 | [50–52]
546 | Roll call: US Senate 81 | Political: voting | Y | 108 | 5,646 | 0.9772 | [50–52]
547 | Roll call: US Senate 82 | Political: voting | Y | 98 | 4,748 | 0.9989 | [50–52]
548 | Roll call: US Senate 83 | Political: voting | Y | 110 | 5,724 | 0.9548 | [50–52]
549 | Roll call: US Senate 84 | Political: voting | Y | 99 | 4,845 | 0.9988 | [50–52]
550 | Roll call: US Senate 85 | Political: voting | Y | 101 | 5,014 | 0.9929 | [50–52]
551 | Roll call: US Senate 86 | Political: voting | Y | 103 | 5,246 | 0.9987 | [50–52]
552 | Roll call: US Senate 87 | Political: voting | Y | 105 | 5,444 | 0.9971 | [50–52]
553 | Roll call: US Senate 88 | Political: voting | Y | 103 | 5,249 | 0.9992 | [50–52]
554 | Roll call: US Senate 89 | Political: voting | Y | 103 | 5,247 | 0.9989 | [50–52]
555 | Roll call: US Senate 90 | Political: voting | Y | 101 | 5,048 | 0.9996 | [50–52]
556 | Roll call: US Senate 91 | Political: voting | Y | 102 | 5,148 | 0.9994 | [50–52]
557 | Roll call: US Senate 92 | Political: voting | Y | 102 | 5,147 | 0.9992 | [50–52]
558 | Roll call: US Senate 93 | Political: voting | Y | 103 | 5,246 | 0.9987 | [50–52]
559 | Roll call: US Senate 94 | Political: voting | Y | 101 | 5,049 | 0.9998 | [50–52]
560 | Roll call: US Senate 95 | Political: voting | Y | 104 | 5,345 | 0.9979 | [50–52]
561 | Roll call: US Senate 96 | Political: voting | Y | 101 | 5,049 | 0.9998 | [50–52]
562 | Roll call: US Senate 97 | Political: voting | Y | 101 | 5,049 | 0.9998 | [50–52]
563 | Roll call: US Senate 98 | Political: voting | Y | 101 | 5,049 | 0.9998 | [50–52]
564 | Roll call: US Senate 99 | Political: voting | Y | 101 | 5,049 | 0.9998 | [50–52]
565 | Roll call: US Senate 100 | Political: voting | Y | 101 | 5,049 | 0.9998 | [50–52]
566 | Roll call: US Senate 101 | Political: voting | Y | 100 | 4,950 | 1.0000 | [50–52]
567 | Roll call: US Senate 102 | Political: voting | Y | 102 | 5,148 | 0.9994 | [50–52]
568 | Roll call: US Senate 103 | Political: voting | Y | 102 | 5,080 | 0.9862 | [50–52]
569 | Roll call: US Senate 104 | Political: voting | Y | 103 | 5,247 | 0.9989 | [50–52]
570 | Roll call: US Senate 105 | Political: voting | Y | 100 | 4,950 | 1.0000 | [50–52]
571 | Roll call: US Senate 106 | Political: voting | Y | 102 | 5,148 | 0.9994 | [50–52]
572 | Roll call: US Senate 107 | Political: voting | Y | 102 | 5,148 | 0.9994 | [50–52]
573 | Roll call: US Senate 108 | Political: voting | Y | 100 | 4,950 | 1.0000 | [50–52]
574 | Roll call: US Senate 109 | Political: voting | Y | 101 | 5,049 | 0.9998 | [50–52]
575 | Roll call: US Senate 110 | Political: voting | Y | 102 | 5,147 | 0.9992 | [50–52]
576 | UK House of Commons voting 1992-1997 | Political: voting | Y | 668 | 220,761 | 0.9909 | [53]
577 | UK House of Commons voting 1997-2001 | Political: voting | Y | 671 | 223,092 | 0.9925 | [53]
578 | UK House of Commons voting 2001-2005 | Political: voting | Y | 657 | 215,246 | 0.9988 | [53]
579 | UN resolutions 1 | Political: voting | Y | 54 | 1,431 | 1.0000 | [54]
580 | UN resolutions 2 | Political: voting | Y | 57 | 1,594 | 0.9987 | [54]
581 | UN resolutions 3 | Political: voting | Y | 59 | 1,711 | 1.0000 | [54]
582 | UN resolutions 4 | Political: voting | Y | 59 | 1,711 | 1.0000 | [54]
583 | UN resolutions 5 | Political: voting | Y | 60 | 1,770 | 1.0000 | [54]
584 | UN resolutions 6 | Political: voting | Y | 60 | 1,768 | 0.9989 | [54]
585 | UN resolutions 7 | Political: voting | Y | 60 | 1,770 | 1.0000 | [54]
586 | UN resolutions 8 | Political: voting | Y | 60 | 1,770 | 1.0000 | [54]
587 | UN resolutions 9 | Political: voting | Y | 60 | 1,770 | 1.0000 | [54]
588 | UN resolutions 10 | Political: voting | Y | 65 | 2,037 | 0.9793 | [54]
589 | UN resolutions 11 | Political: voting | Y | 81 | 3,239 | 0.9997 | [54]
590 | UN resolutions 12 | Political: voting | Y | 82 | 3,317 | 0.9988 | [54]
591 | UN resolutions 13 | Political: voting | Y | 82 | 3,294 | 0.9919 | [54]
592 | UN resolutions 14 | Political: voting | Y | 82 | 3,321 | 1.0000 | [54]
593 | UN resolutions 15 | Political: voting | Y | 99 | 4,851 | 1.0000 | [54]
594 | UN resolutions 16 | Political: voting | Y | 104 | 5,356 | 1.0000 | [54]
595 | UN resolutions 17 | Political: voting | Y | 110 | 5,995 | 1.0000 | [54]
596 | UN resolutions 18 | Political: voting | Y | 113 | 6,246 | 0.9870 | [54]
597 | UN resolutions 20 | Political: voting | Y | 117 | 6,672 | 0.9832 | [54]
598 | UN resolutions 21 | Political: voting | Y | 122 | 7,333 | 0.9935 | [54]
599 | UN resolutions 22 | Political: voting | Y | 124 | 7,616 | 0.9987 | [54]
600 | UN resolutions 23 | Political: voting | Y | 126 | 7,855 | 0.9975 | [54]
601 | UN resolutions 24 | Political: voting | Y | 126 | 7,851 | 0.9970 | [54]
602 | UN resolutions 25 | Political: voting | Y | 126 | 7,868 | 0.9991 | [54]
603 | UN resolutions 26 | Political: voting | Y | 132 | 8,641 | 0.9994 | [54]
604 | UN resolutions 27 | Political: voting | Y | 132 | 8,646 | 1.0000 | [54]
605 | UN resolutions 28 | Political: voting | Y | 134 | 8,905 | 0.9993 | [54]
606 | UN resolutions 29 | Political: voting | Y | 137 | 9,202 | 0.9878 | [54]
607 | UN resolutions 30 | Political: voting | Y | 143 | 10,117 | 0.9965 | [54]
608 | UN resolutions 31 | Political: voting | Y | 144 | 10,291 | 0.9995 | [54]
609 | UN resolutions 32 | Political: voting | Y | 146 | 10,585 | 1.0000 | [54]
610 | UN resolutions 33 | Political: voting | Y | 148 | 10,878 | 1.0000 | [54]
611 | UN resolutions 34 | Political: voting | Y | 150 | 11,173 | 0.9998 | [54]
612 | UN resolutions 35 | Political: voting | Y | 151 | 11,287 | 0.9966 | [54]
613 | UN resolutions 36 | Political: voting | Y | 155 | 11,935 | 1.0000 | [54]
614 | UN resolutions 37 | Political: voting | Y | 156 | 12,090 | 1.0000 | [54]
615 | UN resolutions 38 | Political: voting | Y | 157 | 12,243 | 0.9998 | [54]
616 | UN resolutions 39 | Political: voting | Y | 158 | 12,403 | 1.0000 | [54]
617 | UN resolutions 40 | Political: voting | Y | 158 | 12,403 | 1.0000 | [54]
618 | UN resolutions 41 | Political: voting | Y | 158 | 12,403 | 1.0000 | [54]
619 | UN resolutions 42 | Political: voting | Y | 158 | 12,402 | 0.9999 | [54]
620 | UN resolutions 43 | Political: voting | Y | 158 | 12,403 | 1.0000 | [54]
621 | UN resolutions 44 | Political: voting | Y | 158 | 12,403 | 1.0000 | [54]
622 | UN resolutions 45 | Political: voting | Y | 154 | 11,781 | 1.0000 | [54]
623 | UN resolutions 46 | Political: voting | Y | 168 | 13,872 | 0.9889 | [54]
624 | UN resolutions 47 | Political: voting | Y | 174 | 14,944 | 0.9929 | [54]
625 | UN resolutions 48 | Political: voting | Y | 178 | 15,606 | 0.9907 | [54]
626 | UN resolutions 49 | Political: voting | Y | 174 | 14,913 | 0.9908 | [54]
627 | UN resolutions 50 | Political: voting | Y | 179 | 15,826 | 0.9934 | [54]
628 | UN resolutions 51 | Political: voting | Y | 180 | 16,096 | 0.9991 | [54]
629 | UN resolutions 52 | Political: voting | Y | 176 | 15,349 | 0.9967 | [54]
630 | UN resolutions 53 | Political: voting | Y | 177 | 15,500 | 0.9951 | [54]
631 | UN resolutions 54 | Political: voting | Y | 174 | 14,970 | 0.9946 | [54]
632 | UN resolutions 55 | Political: voting | Y | 182 | 16,333 | 0.9916 | [54]
633 | UN resolutions 56 | Political: voting | Y | 179 | 15,812 | 0.9925 | [54]
634 | UN resolutions 57 | Political: voting | Y | 187 | 17,373 | 0.9990 | [54]
635 | UN resolutions 58 | Political: voting | Y | 189 | 17,735 | 0.9983 | [54]
636 | UN resolutions 59 | Political: voting | Y | 191 | 18,140 | 0.9997 | [54]
637 | UN resolutions 60 | Political: voting | Y | 191 | 18,110 | 0.9981 | [54]
638 | UN resolutions 61 | Political: voting | Y | 192 | 18,331 | 0.9997 | [54]
639 | UN resolutions 62 | Political: voting | Y | 192 | 18,331 | 0.9997 | [54]
640 | UN resolutions 63 | Political: voting | Y | 192 | 18,328 | 0.9996 | [54]
641 | Biogrid: A. thaliana | Protein interaction | N | 406 | 625 | 0.0076 | EH
642 | Biogrid: C. elegans | Protein interaction | N | 3,353 | 6,449 | 0.0011 | m\
643 | Biogrid: D. melanogaster | Protein interaction | N | 7,174 | 24,897 | 0.0010 | EH
644 | Biogrid: H. sapiens | Protein interaction | N | 8,205 | 25,699 | 0.0008 | EH
645 | Biogrid: M. musculus | Protein interaction | N | 710 | 1,003 | 0.0040 | m
646 | Biogrid: R. norvegicus* | Protein interaction | N | 121 | 135 | 0.0186 | EH
647 | Biogrid: S. cerevisiae | Protein interaction | N | 1,753 | 4,811 | 0.0031 | EH
648 | Biogrid: S. pombe | Protein interaction | N | 1,477 | 11,404 | 0.0105 | EH
649 | DIP: H. pylori | Protein interaction | N | 686 | 1,351 | 0.0058 | [56] [57]
650 | DIP: H. sapiens | Protein interaction | N | 639 | 982 | 0.0048 | Esusa
651 | DIP: M. musculus | Protein interaction | N | 50 | 55 | 0.0449 | [56] [57]
652 | DIP: C. elegans | Protein interaction | N | 2,386 | 3,825 | 0.0013 | [SBEZ]
653 | Human: Ccsb | Protein interaction | N | 1,307 | 2,483 | 0.0029 | [58]
654 | Human: Ophid | Protein interaction | N | 5,464 | 23,238 | 0.0016 | [SUED]
655 | STRING: C. elegans | Protein interaction | N | 1,762 | 95,227 | 0.0614 | m
656 | STRING: S. cerevisiae | Protein interaction | N | 534 | 57,672 | 0.4053 |
657 | Yeast: Oxford Statistics | Protein interaction | N | 2,224 | 6,609 | 0.0027 | m
658 | Yeast: DIP | Protein interaction | N | 4,906 | 17,218 | 0.0014 | [56] Ell [62]
659 | Yeast: DIPC | Protein interaction | N | 2,587 | 6,094 | 0.0018 | [56] [57] [62]
660 | Yeast: FHC | Protein interaction | N | 2,233 | 5,750 | 0.0023 | E3E3]
661 | Yeast: FYI | Protein interaction | N | 778 | 1,798 | 0.0059 | [HEU
662 | Yeast: PCA | Protein interaction | N | 889 | 2,407 | 0.0061 | [62] [65]
663 | Corporate directors in Scotland (1904-1905)* | Social | Y | 131 | 676 | 0.0794 | [161 166]
664 | Corporate ownership (EVA) | Social | N | 4,475 | 4,652 | 0.0005 | ED
665 | Dolphins* | Social | N | 62 | 159 | 0.0841 | EH
666 | Family planning in Korea | Social | N | 33 | 68 | 0.1288 | m
667 | Unionization in a high-tech firm* | Social | N | 33 | 91 | 0.1723 | ma
668 | Communication within a sawmill on strike* | Social | N | 36 | 62 | 0.0984 | ED
669 | Leadership course | Social | N | 32 | 80 | 0.1613 | E3
670 | Les Miserables* | Social | Y | 77 | 254 | 0.0868 | ED
671 | Marvel comics | Social | Y | 6,449 | 168,211 | 0.0081 | E2
672 | Mexican political elite | Social | N | 35 | 117 | 0.1966 | m
673 | Pretty-good-privacy (PGP) algorithm users | Social | N | 10,680 | 24,316 | 0.0004 | m
674 | Prisoners | Social | N | 67 | 142 | 0.0642 | E2
675 | Bernard and Killworth fraternity: observed | Social | Y | 58 | 967 | 0.5850 | [78–80]
676 


Bernard and Killworth fraternity: recalled 


Social 


Y 


58 


1,653 


1.0000 


[ZMZZ] 


677 


Bernard and Killworth HAM radio: observed 


Social 


Y 


41 


153 


0.1866 


EHHH01 


678 


Bernard and Killworth HAM radio: recalled 


Social 


Y 


44 


442 


0.4672 


ESHES1 


679 


Bernard and Killworth office: observed 


Social 


Y 


40 


238 


0.3051 


[ZSHSo] 


680 


Bernard and Killworth office: recalled 


Social 


Y 


40 


779 


0.9987 


|78H80| 


681 


Bernard and Killworth technical: observed 


Social 


Y 


34 


175 


0.3119 


ESHES1 


681 


Bernard and Killworth technical: recalled 


Social 


Y 


34 


561 


1.0000 


EHEq] 


683 


Kapferer tailor shop: instrumental (tl) 


Social 


N 


35 


76 


0.1277 


ED 


684 


Kapferer tailor shop: instrumental (t2) 


Social 


N 


34 


93 


0.1658 


ED 


685 


Kapferer tailor shop: associational (tl) 


Social 


N 


39 


158 


0.2132 


ED 


686 


Kapferer tailor shop: associational (t2) 


Social 


N 


39 


223 


0.3009 


ED 


687 


University Rovira i Virgili (Tarragona) e-mail 


Social 


N 


1,133 


5,451 


0.0085 


E21 


688 


Zachary karate club* 


Social 


N 


34 


78 


0.1390 


ESI 


689 


BA: (100,1)* 


Synthetic 


N 


100 


99 


0.0200 


El 


690 


BA: (100,2)* 


Synthetic 


N 


100 


197 


0.0398 


El 


691 


BA: (1000,1) 


Synthetic 


N 


1,000 


999 


0.0020 


El 


692 


BA: (1000,2) 


Synthetic 


N 


1,000 


1,997 


0.0040 


El 


693 


BA: (500,1) 


Synthetic 


N 


500 


499 


0.0040 


El 


694 


BA: (500,2) 


Synthetic 


N 


500 


997 


0.0080 


El 


695 


ER: (100,0.25)* 


Synthetic 


N 


100 


1,264 


0.2554 


ID 


696 


ER: (100,0.5) 


Synthetic 


N 


100 


2,436 


0.4921 


ID 


697 


ER: (100,0.75) 


Synthetic 


N 


100 


3,697 


0.7469 


ID 


698 


ER: (500,0.25) 


Synthetic 


N 


500 


31,148 


0.2497 


ID 


699 


ER: (500,0.5) 


Synthetic 


N 


500 


62,301 


0.4994 


ID 


700 


ER: (500,75) 


Synthetic 


N 


500 


93,780 


0.7517 


ID 
ID


Name 


Category 


Weighted 


N 


L 


f_e


References 


701 


Fractal: (10,2,1)


Synthetic


N 


1,024 


9,256 


0.0177 


Bl 


702 


Fractal: (10,2,2)


Synthetic


N 


1,024 


16,875 


0.0322 




703 


Fractal: (10,2,3)


Synthetic


N 


1,024 


30,344 


0.0579 


[41 


704 


Fractal: (10,2,4)


Synthetic


N 


1,024 




0.1012 


III 


705 


Fractal: (10,2,5)


Synthetic


N 


1,024 


89,812 


0.1715 


[41 


706 


Fractal: (10,2,6)


Synthetic


N 


1,024 


147,784 


0.2822


[41 


707 


Fractal: (10,2,7)


Synthetic


N 


1,024


232,794


0.4445 


fil 


708 


Fractal: (10,2,8)


Synthetic


N 


1,024 


S4S 




[41 


709 


H13-4 benchmark


Synthetic


N 


256 


2,311 


0.0708 


l84l 


710 


LFR benchmark: (1000,15,50,0.1,2,2)


Synthetic


N 


1,000 


7 573 


0.0152 


[61 


711 


LFR benchmark: (1000,15,50,0.1,3,1)


Synthetic


N 


1,000 


7,447 


0.0149 


[61 


712 


LFR benchmark: (1000,15,50,0.5,2,2)


Synthetic


N 


1,000 


7,624 


0.0153 


[61 


713 


LFR benchmark: (1000,15,50,0.5,3,1)


Synthetic


N 


1,000 


7,177 


0.0144 


[61 


714 


LFR benchmark: (1000,25,50,0.1,2,2)


Synthetic


N 


1,000 


12,739 


0.0255


[61 


715 


LFR benchmark: (1000,25,50,0.1,3,1)


Synthetic


N 


1,000 


12,523 


0.0251 


[61 


716 


LFR benchmark: (1000,25,50,0.5,2,2)


Synthetic


N 


1,000 


12,744 


0.0255


[61 


717 


LFR benchmark: (1000,25,50,0.5,3,1)


Synthetic


N 


1,000 


12,662 


0.0253


[61 


718 


LF benchmark: (1000,15,50,0.1,0.1,1,2,1)


Synthetic


Y 


1,000 


7,680 


0.0154 


[81 


719 


LF benchmark: (1000,15,50,0.1,0.1,1,2,2)


Synthetic


Y 


1,000 


7,791 


0.0156 


[81 


720 


LF benchmark: (1000,15,50,0.5,0.1,1,2,1)


Synthetic


Y 


1,000 


7,657 


0.0153 


[81 


721 


LF benchmark: (1000,15,50,0.5,0.1,2,2,2)


Synthetic


Y 


1,000 


7,912 


0.0158 


[81 


722 


LF benchmark: (1000,15,50,0.5,0.5,1,2,1)


Synthetic


Y 


1,000 


7,693 


0.0154 


[81 


723 


LF benchmark: (1000,15,50,0.5,0.5,1,2,2)


Synthetic


Y 


1,000 


7,906 


0.0158 


[81 


724 


LF benchmark: (1000,25,50,0.1,0.1,1,2,1)


Synthetic


Y 


1,000 


12,660 


0.0253


[81 


725 


LF benchmark: (1000,25,50,0.1,0.1,2,2,2) 


Synthetic


Y 


1,000 


12,641 


0.0253


[81 


726 


LF benchmark: (1000,25,50,0.5,0.1,1,2,1) 


Synthetic 


Y 


1,000 


12,771 


0.0256


[H 


727 


LF benchmark: (1000,25,50,0.5,0.1,2,2,2) 


Synthetic 


Y 


1,000 


12,772 


0.0256


[81 


728 


LF benchmark: (1000,25,50,0.5,0.5,1,2,1) 


Synthetic 


Y 


1,000 


12,962 


0.0259 


El 


729 


LF benchmark: (1000,25,50,0.5,0.5,2,2,2) 


Synthetic


Y 


1,000 


12,881 


0.0258


[81 


730 


LF-NG benchmark 


Synthetic 


Y 


128 


1,024 


0.1260 


[HE] 


731 


Random fully-connected: (100) 


Synthetic 


Y 


100 


4,950 


1.0000 




732 


Random fully-connected: (500) 


Synthetic 


Y 


500 


124,750 


1.0000 


L J 


733 


WS: (100,1,0.1) 


Synthetic


N 


100 


100 


0.0202


[21 


734 


WS: (100,1,0.5) 


Synthetic


N 


73 


73 


0.0278 


[21 


735 


WS: (100,4,0.1) 


Synthetic


N 


100 


407 


0.0822 


[21 


736 


WS: (100,4,0.5) 


Synthetic 


N 


100 


522 


0.1055 


[H 


737 


WS: (1000,1,0.1) 


Synthetic 


N 


850 


850 


0.0024 


[21 


738 


WS: (1000,1,0.5) 


Synthetic


N 


877 


877 


0.0023


[21 


739 


WS: (1000,4,0.1) 


Synthetic 


N 


1,000 


4,053 


0.0081 


El 


740 


WS: (1000,4,0.5) 


Synthetic 


N 


1,000 


5,138 


0.0103 


m 


741 


KOSKK: (1000,1,10,10,5 × 10^-5,1 × 10^-3,100)


Synthetic 


Y 


519 


2,096 


0.0156 




742 


KOSKK: (1000,1,10,10,5 × 10^-5,1 × 10^-3,1000)


Synthetic 


Y 


895 


7,682 


0.0192 




743 


KOSKK: (1000,1,100,10,5 × 10^-5,1 × 10^-3,1000)


Synthetic 


Y 


870 


4,725 


0.0125 


m 


744 


KOSKK: (1000,1,100,10,5 × 10^-5,1 × 10^-3,100)


Synthetic 


Y 


652 


2,125 


0.0100 




745 


KOSKK: (1000,1,50,10,5 × 10^-5,1 × 10^-3,100)


Synthetic 


Y 


459 


1,554 


0.0148 


m 


746 


KOSKK: (1000,1,50,10,5 × 10^-5,1 × 10^-3,1000)


Synthetic 


Y 


851 


4,960 


0.0137 